Introduction

This is a custom math library for Arduino that is intended to be more efficient than the standard C math library. I designed it primarily for my Arduino Hexapod and Quadruped robots, but you might also find it useful in other applications. So far I have written sin(), cos(), acos() and atan2() functions that have proven to be faster than the built-in ones. The other function implementations might not be faster than the standard ones, but it's interesting to see how they can be implemented.

This custom maths library runs 10+ times faster than the standard built-in library on my robots (measured as the running time of the relatively math-intensive Inverse Kinematics algorithm). Compared with my old maths library, I also get a 10% improvement; see the Results section for details.

The full source code is not available on this page; please request it in the comments. Part of the implementation is at the end of this post.

Custom Math Library Development History

I have done a basic custom maths library before, where I used look-up tables for sin() and cos() functions, and a combination of polynomial and look-up table for the acos() function.

In this enhanced version, I have come up with alternative solutions for these functions and for other maths functions, which I will refer to as the 'New' ways in this post. They are not necessarily better than the old ways, so I will compare them in the next section. The implementations of these 'New' functions are at the end of the post.

Benchmarking: the Built-in Way, Old Way and New Way

The tests cover:

New sqrt function – exploiting bit shift operation
New sin/cos function – using int look-up table with 0.5 degree interval (double look-up table size of the old way)
New acos function – only using a look-up table (byte values)
New atan2 function – using sqrt() and acos()
Performance of int/long versus float in arithmetic operations (addition, multiplication and division)

All tests are carried out using the Arduino Mega 2560 board, because that’s the environment I will be using this custom library in.

Results

Some of the results came back as expected, and some didn't. The performance really depends on the hardware architecture and varies from platform to platform. All results were measured in microseconds (us).

sin() and cos()

Both functions were run 2 million times with a random input between 0 and 360 degrees, for the built-in way and the new way. I am not testing the old way because the new way is almost the same, except that the look-up table size is now doubled.

Built-in way took: 23,273,248 us
New way took: 2,524,220 us
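The timing harness itself is not shown in this post. A minimal sketch of how such a loop could be timed is below. The name `timeRuns` and its parameters are my own assumptions, and the clock is passed in as an argument so that Arduino's `micros()` can be used on the board.

```cpp
#include <stdint.h>

// Hypothetical helper (not from the original library): times `runs` calls of
// `fn`, reading the clock through `clockUs`. On the board, pass micros().
template <typename Clock, typename Fn>
uint32_t timeRuns(Clock clockUs, Fn fn, uint32_t runs) {
    volatile float sink = 0;              // volatile store keeps the calls from being optimized away
    const uint32_t start = clockUs();
    for (uint32_t i = 0; i < runs; ++i) {
        sink = fn(i);                     // fn receives the loop index and returns a float
    }
    (void)sink;
    return clockUs() - start;             // unsigned subtraction stays correct across one counter wrap
}
```

On the Mega this might be called as `timeRuns(micros, [](uint32_t) { return fsin(random(0, 3600)); }, 2000000UL)`. The cost of generating the random input is included in the measurement, so it should be the same for every variant being compared.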

acos()

All three ways were tested: built-in way VS old way (polynomial + look-up) VS new way (pure look-up + mul + div). The acos() functions were run 100,000 times with a random input between 0 and 1.

Built-in way: 52,245,320 us
Old way: 38,648,904 us
New way: 19,225,740 us

atan2()

Built-in way VS New way (sqrt() + acos()). This function was run 100,000 times with two random float numbers between -10.0 and 10.0.

Built-in way: 6,163,108 us
New way: 5,685,292 us

sqrt()

Built-in way VS New way (whole number bit shift operation).

I lost the exact result data for this one, but the outcome clearly showed that the built-in way is still far better than the new implementation.

Arithmetical Operation with Float and Long/Int

This is an interesting one. In theory, arithmetic operations should generally be faster with long and int data than with float. But to represent digits after the decimal point with long/int, each number is multiplied by a scale factor such as 1000, depending on how many decimal places we want; for example, to keep 3 decimal places, 3.203 becomes 3203. When this scaled number is multiplied or divided by another scaled number with the same precision (for example 1.123 -> 1123), we need to divide by 1000 after the multiplication, or multiply by 1000 before the division. That introduces overhead. With float we could just have this:

3.203 * 1.123 OR 3.203 / 1.123

but now with the scaled int/long representation:

3203 * 1123 / 1000 OR 3203 * 1000 / 1123
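The scaled arithmetic above can be sketched as follows. The names `SCALE`, `scaledMul` and `scaledDiv` are made up for illustration and are not part of the library.

```cpp
#include <stdint.h>

const int32_t SCALE = 1000;   // three decimal places, as in the example above

// Multiply two scaled values: the raw product carries SCALE twice,
// so one division is needed to bring it back.
int32_t scaledMul(int32_t a, int32_t b) {
    return a * b / SCALE;
}

// Divide two scaled values: pre-multiply the numerator so the
// quotient keeps its SCALE.
int32_t scaledDiv(int32_t a, int32_t b) {
    return a * SCALE / b;
}
```

With the numbers from the text, `scaledMul(3203, 1123)` gives 3596 (that is, 3.596) and `scaledDiv(3203, 1123)` gives 2852 (that is, 2.852). Both results are truncated rather than rounded. The intermediate product must also fit in 32 bits, which limits how large the operands can be.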

For testing, all numbers used were random. Each test was run 20,000 times.

1 add 1 mul 1 div
- float: 1,019,000 us
- long: 888,120 us
3 add 1 mul 1 div
- float: 1,355,700 us
- long: 919,560 us
1 add 3 mul 1 div
- float: 1,398,700 us
- long: 1,267,352 us
1 add 1 mul 3 div
- float: 2,299,000 us
- long: 2,448,716 us

The long data type performed quite well with a low number of arithmetic operations and with a higher number of additions. But as expected, the more multiplications and divisions there are, the closer it gets to the performance of the float type. It even became slower than float when more divisions were involved.

Obviously, the more operations we do, the more computationally expensive it gets, and the extra scaling steps do us no good. Besides, on the Arduino architecture float and long both take up 32 bits, so replacing float with long brings no memory advantage. Therefore we will stick to float when needed.

The same applies to the look-up table. Some might suggest using an integer type for it, which is 16 bits versus 32 bits for float (which I currently use). But if the trigonometric functions output float, we would have to convert each value (probably with a division) before returning it. This would kill the performance. So whenever possible, I will stick to float in the look-up table so I can use the data directly.
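To make the trade-off concrete, here is a small sketch with two hypothetical four-entry tables (sin of 0, 30, 60 and 90 degrees). The table names, the sizes and the scale factor of 10000 are assumptions for illustration only.

```cpp
#include <stdint.h>

// Hypothetical excerpts, not the library's real tables.
const float   SIN_F[4]   = {0.0f, 0.5f, 0.8660f, 1.0f};
const int16_t SIN_I16[4] = {0, 5000, 8660, 10000};   // same values scaled by 10000

// float table: the stored value is returned as-is.
float lookupFloat(uint8_t i) {
    return SIN_F[i];
}

// int16_t table: half the memory, but every read pays for an
// int-to-float conversion and a float division.
float lookupInt(uint8_t i) {
    return SIN_I16[i] / 10000.0f;
}
```

The integer table halves the storage, but the conversion and division happen on every call, which is exactly the cost the paragraph above wants to avoid.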

Other Notes

One thing we should not do is run the performance tests on a platform other than the Arduino itself, because the results vary from architecture to architecture.

During my research, I also found some useful information about Arduino:

PROGMEM – lets you store variables in flash memory rather than RAM. It's useful when you have run out of RAM and still have a large amount of data to store somewhere, but it's quite slow compared to RAM.
Different types of memory on Arduino boards – the maximum size of the variables an Arduino can hold (such as a look-up table) is limited by its RAM, because variables are stored in SRAM.
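As a rough illustration of the PROGMEM option, the sketch below keeps a tiny table in flash and reads it back with `pgm_read_float` from avr-libc. `DEMO_TABLE` and `readDemo` are hypothetical names, and the `#else` branch is only a fallback I added so the snippet also compiles off-board.

```cpp
#if defined(__AVR__)
#include <avr/pgmspace.h>
#else
// Host fallback (assumption, for off-board compilation only).
#define PROGMEM
#define pgm_read_float(addr) (*(addr))
#endif

// Hypothetical 3-entry table kept in flash instead of SRAM.
const float DEMO_TABLE[3] PROGMEM = {0.0f, 0.5f, 1.0f};

float readDemo(unsigned char i) {
    // On AVR, flash data has to be fetched through a pgm_read_* macro;
    // a plain DEMO_TABLE[i] would read from the wrong address space.
    return pgm_read_float(&DEMO_TABLE[i]);
}
```

Every access goes through the macro, which is where the extra read time compared to SRAM comes from.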

If you want to discuss or share your ideas, you can post something in our forum here.

New Way Implementations:

sin()

[sourcecode language="cpp"]
// deg is in tenths of a degree (3600 = 360 degrees).
// SIN_TABLE holds sin() for 0 to 90 degrees in 0.5 degree steps.
float fsin(int deg){
    float result = 0;
    int sign = 1;

    // sin(-x) = -sin(x): remember the sign and work with a positive angle.
    if (deg < 0){
        deg = -deg;
        sign = -1;
    }

    // Wrap into one full turn.
    while (deg >= 3600)
        deg -= 3600;

    // 0 to 90 degrees.
    if((deg >= 0) && (deg <= 900))
        result = SIN_TABLE[deg / 5];

    // 90 to 180 degrees.
    else if((deg > 900) && (deg <= 1800))
        result = SIN_TABLE[(1800 - deg) / 5];

    // 180 to 270 degrees.
    else if((deg > 1800) && (deg <= 2700))
        result = -SIN_TABLE[(deg - 1800) / 5];

    // 270 to 360 degrees.
    else if((deg > 2700) && (deg <= 3600))
        result = -SIN_TABLE[(3600 - deg) / 5];

    return sign * result;
}
[/sourcecode]
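SIN_TABLE itself is not listed in this post. Under the assumption that it holds sin() for 0 to 90 degrees in 0.5 degree steps (181 float entries, which is what the index arithmetic above implies), it could be filled once at start-up like this. The real library most likely ships precomputed constants instead; `buildSinTable` is a hypothetical helper.

```cpp
#include <math.h>

const int SIN_TABLE_SIZE = 181;          // 0 to 90 degrees in 0.5 degree steps (assumed layout)
float SIN_TABLE[SIN_TABLE_SIZE];

// Fill the table once, e.g. from setup(); entry i holds sin(i * 0.5 degrees).
void buildSinTable() {
    for (int i = 0; i < SIN_TABLE_SIZE; ++i) {
        SIN_TABLE[i] = sinf(i * 0.5f * 3.14159265f / 180.0f);
    }
}
```

With this layout, index 60 corresponds to 30 degrees and index 180 to 90 degrees. The table takes 181 x 4 = 724 bytes of SRAM.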

cos()

[sourcecode language="cpp"]
// deg is in tenths of a degree (3600 = 360 degrees).
float fcos(int deg){
    float result = 0;

    // cos(-x) = cos(x), so the sign can simply be dropped.
    if (deg < 0)
        deg = -deg;

    // Wrap into one full turn.
    while (deg >= 3600)
        deg -= 3600;

    // 0 to 90 degrees: cos(x) = sin(90 - x).
    if((deg >= 0) && (deg <= 900))
        result = SIN_TABLE[(900 - deg) / 5];

    // 90 to 180 degrees.
    else if((deg > 900) && (deg <= 1800))
        result = -SIN_TABLE[(deg - 900) / 5];

    // 180 to 270 degrees.
    else if((deg > 1800) && (deg <= 2700))
        result = -SIN_TABLE[(2700 - deg) / 5];

    // 270 to 360 degrees.
    else if((deg > 2700) && (deg <= 3600))
        result = SIN_TABLE[(deg - 2700) / 5];

    return result;
}
[/sourcecode]

acos()

[sourcecode language="cpp"]
// ACOS_TABLE stores byte values; multiplying by 0.00616 turns them into radians.
// DEC4 is defined elsewhere; the constants below imply DEC4 = 10000.
float facos(float num){
    float rads = 0;
    bool negative = false;

    // Get sign of input.
    if(num < 0){
        negative = true;
        num = -num;
    }

    // num between 0 and 0.9: coarse steps.
    if((num >= 0) && (num < 0.9))
        rads = (float)ACOS_TABLE[(int)(num*DEC4/79 + 0.5)] * 0.00616;

    // num between 0.9 and 0.99: finer steps.
    else if ((num >= 0.9) && (num < 0.99))
        rads = (float)ACOS_TABLE[(int)((num*DEC4 - 9000)/8 + 0.5) + 114] * 0.00616;

    // num between 0.99 and 1.0: finest steps, where acos() changes fastest.
    else if ((num >= 0.99) && (num <= 1))
        rads = (float)ACOS_TABLE[(int)((num*DEC4 - 9900)/2 + 0.5) + 227] * 0.00616;

    // Account for the negative sign if required: acos(-x) = PI - acos(x).
    if(negative)
        rads = PI - rads;

    return rads;
}
[/sourcecode]
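ACOS_TABLE is not listed either. The sketch below is my reconstruction of how a compatible table could be generated by inverting the three index formulas in facos(). The table size of 278, the value DEC4 = 10000 and the helper names are all inferred or invented, so treat this as an assumption rather than the library's actual data.

```cpp
#include <math.h>
#include <stdint.h>

const int ACOS_TABLE_SIZE = 278;         // inferred: last index is 50 + 227 = 277
uint8_t ACOS_TABLE[ACOS_TABLE_SIZE];

// Input value that table index i stands for (inverse of the index formulas).
float acosTableInput(int i) {
    if (i < 114) return i * 79 / 10000.0f;                // 0 to 0.9, step 0.0079
    if (i < 227) return 0.9f + (i - 114) * 8 / 10000.0f;  // 0.9 to 0.99, step 0.0008
    return 0.99f + (i - 227) * 2 / 10000.0f;              // 0.99 to 1.0, step 0.0002
}

// Store acos() of each input as a byte, in units of 0.00616 rad.
void buildAcosTable() {
    for (int i = 0; i < ACOS_TABLE_SIZE; ++i) {
        float x = acosTableInput(i);
        if (x > 1.0f) x = 1.0f;                           // guard against rounding past 1
        ACOS_TABLE[i] = (uint8_t)(acosf(x) / 0.00616f + 0.5f);
    }
}
```

The unit of 0.00616 rad is what lets the full range of 0 to PI/2 fit into a single byte, since 255 x 0.00616 is about 1.5708.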

atan2()

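[sourcecode language="cpp"]
float fatan2(float opp, float adj){
    float hypt = sqrt(adj * adj + opp * opp);

    // Avoid dividing by zero at the origin, where the angle is undefined.
    if(hypt == 0)
        return 0;

    // Angle in [0, PI] from the adjacent side and the hypotenuse.
    float rad = facos(adj / hypt);

    // Points below the x axis get a negative angle.
    if(opp < 0)
        rad = -rad;

    return rad;
}
[/sourcecode]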
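To check the construction independently of the look-up table, the same identity can be written with the standard `acosf()` standing in for the table-based acos. `atan2ViaAcos` is a hypothetical name used only for this comparison.

```cpp
#include <math.h>

// Reference version of the same construction, using the standard acosf()
// in place of the table-based acos (for comparison only).
float atan2ViaAcos(float opp, float adj) {
    float hypt = sqrtf(adj * adj + opp * opp);
    if (hypt == 0.0f) return 0.0f;      // origin: angle undefined, return 0 by convention
    float rad = acosf(adj / hypt);      // angle in [0, PI]
    return (opp < 0) ? -rad : rad;      // lower half-plane mirrors to negative angles
}
```

For inputs such as (1, 1), (-1, -1) and (3, -4) this agrees with the built-in `atan2f()` to within float rounding, so any larger error seen in the fast version would come from the table resolution.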

sqrt()

[sourcecode language="cpp"]
typedef unsigned long ulong;

// Integer square root: returns the floor of sqrt(number).
ulong fsqrt(ulong number){
    ulong root = 0;
    ulong bit = 1UL << 30;

    // Bit starts at the highest power of four <= the input number.
    while(bit > number) bit >>= 2;

    // Decide one bit of the root per iteration, from the top down.
    while(bit != 0){
        if(number >= root + bit){
            number -= (root + bit);
            root += (bit << 1);
        }
        root >>= 1;
        bit >>= 2;
    }

    return root;
}
[/sourcecode]

You can now find the library on GitHub (not my account).

