
Computer Arithmetic

In this post on computer arithmetic, we first discuss floating-point numbers and then normalized floating-point numbers. After that, we work through the arithmetic operations on normalized floating-point numbers with examples, so that the rules become concrete. At the end, we provide a few examples for practice.

1 Floating Point Representation of Numbers

First of all, we should know what floating point numbers are. As the name suggests, floating point numbers contain a decimal point that can float, that is, appear at any position among the digits. For example, 6.55, 0.0001, and 2,345.5432 are floating point numbers. Numbers that have no fractional part are known as integers.

In a computer, two types of arithmetic operations are available:

1. Integer arithmetic

2. Floating point arithmetic

Integer arithmetic deals with integer operands and is used for subscripts and counting. Floating point arithmetic uses numbers with fractional parts as operands and is used in most computations. Computers are generally designed such that each location in memory, called a word, stores only a finite number of digits. Thus, all operands in arithmetic operations have only a finite number of digits.

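One memory location or word, assuming six digits per word with the decimal point fixed between the fourth and fifth digits:

6 | 7 | 8 | 2 | 4 | 1

Table 1: A memory location storing the number 6782.41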

Thus, the largest and smallest nonzero magnitudes that can be stored are 9999.99 and 0000.01 respectively. This range is quite inadequate in practice.
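The limits of this fixed decimal point word can be illustrated with a small Python sketch. It assumes the six-digit word of Table 1, with two digits after the decimal point; the function name `to_fixed_word` is hypothetical and used only for illustration.

```python
from decimal import Decimal

WORD_DIGITS = 6      # digits in one word (assumption taken from Table 1)
FRACTION_DIGITS = 2  # digits after the assumed decimal point

def to_fixed_word(value):
    """Return the six digits stored for a non-negative decimal string."""
    # Scale by 10**2 and chop whatever does not fit after the decimal point.
    scaled = int(Decimal(value) * 10 ** FRACTION_DIGITS)
    if scaled >= 10 ** WORD_DIGITS:
        raise OverflowError(f"{value} needs more than {WORD_DIGITS} digits")
    return f"{scaled:0{WORD_DIGITS}d}"

print(to_fixed_word("6782.41"))  # 678241
print(to_fixed_word("0.01"))     # 000001
print(to_fixed_word("0.001"))    # 000000 (the value is lost entirely)
```

A value such as 12345.6 raises `OverflowError` in this sketch, and anything smaller than 0.01 is stored as zero, which is the inadequacy referred to above.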

To overcome this, a new convention is adopted that aims to preserve the maximum number of significant digits in a real number and to increase the range of values that can be stored. This representation is called the normalized floating-point mode of representing and storing real numbers.

In this mode, a real number is expressed as a combination of a mantissa and an exponent. The mantissa is made less than 1 and greater than or equal to .1, and the exponent is the power of 10 which multiplies the mantissa.

For example, the number 56.78 × 10^5 is represented in this notation as .5678 E 7, where E 7 is used to represent 10^7. The mantissa is .5678 and the exponent is 7.

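The number is stored in the memory location with the four mantissa digits 5678 followed by the two exponent digits 07, together with the signs of the mantissa and the exponent.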


Shifting the mantissa to the left until its most significant digit is non-zero is called normalization.

For example, the number .0000768 may be stored as .7680 E -4, because the leading zeroes serve only to locate the decimal point.

The range of numbers that may be stored is .9999 × 10^99 to .1000 × 10^-99 in magnitude, which is obviously much larger than the range available earlier in fixed decimal point notation.
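Normalization can be sketched in Python as follows. This is a minimal illustration, assuming a four-digit mantissa, a two-digit exponent, and chopping (not rounding) of extra digits. The names `normalize` and `show` are hypothetical, and the number is held as a pair (mantissa digits as an integer, exponent).

```python
from decimal import Decimal

DIGITS = 4    # mantissa digits (assumption: the hypothetical machine above)
MAX_EXP = 99  # largest exponent magnitude that fits in two digits

def normalize(value):
    """Return (mantissa, exponent) so that value is about .mantissa x 10**exponent."""
    sign, digits, exp = Decimal(value).as_tuple()
    digits = list(digits)
    while digits and digits[0] == 0:      # leading zeroes only locate the point
        digits.pop(0)
    if not digits:
        return 0, 0                       # zero has no normalized form
    exponent = len(digits) + exp          # position of the decimal point
    digits = (digits + [0] * DIGITS)[:DIGITS]  # pad, then chop to four digits
    mantissa = int("".join(map(str, digits)))
    if abs(exponent) > MAX_EXP:
        raise OverflowError("exponent does not fit in two digits")
    return (-mantissa if sign else mantissa), exponent

def show(mantissa, exponent):
    """Format a pair in the '.dddd E n' notation used in this post."""
    sign = "-" if mantissa < 0 else ""
    return f"{sign}.{abs(mantissa):0{DIGITS}d} E {exponent}"

print(show(*normalize("56.78E5")))    # .5678 E 7
print(show(*normalize("0.0000768")))  # .7680 E -4
```

Inputs are passed as strings so that `Decimal` sees the exact decimal digits instead of a binary approximation.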

 

2 Arithmetic Operations Using Normalized Floating Point Numbers

2.1        Addition and Subtraction

If two numbers represented in normalized floating point notation are to be added, the exponents of two numbers must be made equal and the mantissa shifted appropriately. The operation of subtraction is nothing but adding a negative number. Thus the principles are the same.

Example 2.1. Add the following floating point numbers.

1. .3456 E 7 and .4563 E 7

2. .3456 E 7 and .4563 E 9

3. .3456 E 5 and .4563 E 9

4. .6457 E 5 and .4564 E 5

5. .6457 E 99 and .4564 E 99


Solution: 1. In this problem the exponents are equal. Thus we add the mantissas as follows:

.3456 E 7

+ .4563 E 7

= .8019 E 7

 

2. Here the exponents are not equal, so first we make them equal. The operand with the larger exponent is kept as it is, and the operand with the smaller exponent is changed by multiplying and dividing by 10^2, as the difference in the exponents is 2. Shifting the mantissa two places to the right and dropping the digits that no longer fit, .3456 E 7 becomes .0034 E 9.

.4563 E 9

+ .0034 E 9

= .4597 E 9

 

3. Again the exponents are not equal, so first we make them equal. Applying the same procedure as in the previous problem, we multiply and divide by 10^4, as the difference in the exponents is 4. All four digits of the smaller operand are shifted out, so .3456 E 5 becomes .0000 E 9.

.4563 E 9

+ .0000 E 9

= .4563 E 9

 

4. In this problem the exponents are equal. Therefore, we add the mantissas.

.6457 E 5

+ .4564 E 5

= 1.1021 E 5

The sum of the mantissas, 1.1021, has 5 digits and is greater than 1. So the mantissa is shifted right one place before it is stored, the last digit is dropped, and the exponent is increased by 1. Thus the result is .1102 E 6.

5. Here again the exponents are equal, so we add the mantissas and get 1.1021, which is again greater than 1. As before, we shift the mantissa right one place, but now the exponent becomes 100. The exponent part cannot store more than 99. This condition is known as the overflow condition, and the arithmetic unit will signal an error.
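The addition rule can be sketched in Python. This is an illustration under stated assumptions, not a description of any real arithmetic unit: operands are positive, each number is a pair (four-digit mantissa as an integer, exponent), digits shifted out are chopped, and `fp_add` is a hypothetical name.

```python
DIGITS = 4    # mantissa digits
MAX_EXP = 99  # largest exponent that can be stored

def fp_add(a, b):
    """Add two positive numbers given as (mantissa, exponent) pairs."""
    (ma, ea), (mb, eb) = a, b
    if ea < eb:                       # keep the larger exponent in (ma, ea)
        (ma, ea), (mb, eb) = (mb, eb), (ma, ea)
    mb //= 10 ** (ea - eb)            # align: shift right, chopping digits
    total, exp = ma + mb, ea
    if total >= 10 ** DIGITS:         # mantissa sum is 1 or more
        total //= 10                  # shift right one place, drop last digit
        exp += 1
    if exp > MAX_EXP:
        raise OverflowError("exponent exceeds 99")
    return total, exp

print(fp_add((3456, 7), (4563, 7)))  # (8019, 7)
print(fp_add((3456, 7), (4563, 9)))  # (4597, 9)
print(fp_add((3456, 5), (4563, 9)))  # (4563, 9)
print(fp_add((6457, 5), (4564, 5)))  # (1102, 6)
```

Calling `fp_add((6457, 99), (4564, 99))` raises `OverflowError` in this sketch, because the exponent would have to become 100.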

Example 2.2. Subtract the following:

1. .4567 E 7 and .4535 E 7

2. .8967 E 5 and .3456 E 4

3. .4567 E -99 and .4456 E -99

 

Solution: We apply the same concept that we discussed for addition.

1. In this problem the exponents are equal. Therefore, we subtract the mantissas as follows:

.4567 E 7

- .4535 E 7

= .0032 E 7

Normalizing, we shift the mantissa two places to the left and reduce the exponent by 2. Thus we can write it as .3200 E 5.

2. In this problem the exponents are not equal, so we have to make them equal as discussed in the previous examples. Thus, we multiply and divide the operand with the smaller exponent by 10^1, so that .3456 E 4 becomes .0345 E 5.

.8967 E 5

- .0345 E 5

= .8622 E 5

3. Again the exponents are the same, so we subtract the mantissas as follows:

.4567 E -99

- .4456 E -99

= .0111 E -99

For normalization, the mantissa is shifted one place to the left and the exponent is reduced by 1. The exponent would thus become -100, which cannot be stored. This condition is known as the underflow condition, and the arithmetic unit will signal an error.
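Subtraction with left normalization and an underflow check can be sketched in the same way. The assumptions are: positive operands held as (four-digit mantissa, exponent) pairs, chopping when aligning, and a hypothetical `UnderflowError` class, since Python has no built-in exception of that name.

```python
DIGITS = 4    # mantissa digits
MAX_EXP = 99  # largest exponent magnitude that can be stored

class UnderflowError(ArithmeticError):
    """Hypothetical error for an exponent below -99."""

def fp_sub(a, b):
    """Return a - b for positive (mantissa, exponent) pairs."""
    (ma, ea), (mb, eb) = a, b
    exp = max(ea, eb)
    ma //= 10 ** (exp - ea)           # align both operands to the larger
    mb //= 10 ** (exp - eb)           # exponent, chopping shifted-out digits
    diff = ma - mb
    if diff == 0:
        return 0, 0
    sign, diff = (-1 if diff < 0 else 1), abs(diff)
    while diff < 10 ** (DIGITS - 1):  # leading zero: shift left one place
        diff *= 10
        exp -= 1
    if exp < -MAX_EXP:
        raise UnderflowError("exponent below -99")
    return sign * diff, exp

print(fp_sub((4567, 7), (4535, 7)))  # (3200, 5)
print(fp_sub((8967, 5), (3456, 4)))  # (8622, 5)
```

With operands `(4567, -99)` and `(4456, -99)` the loop lowers the exponent to -100, so the sketch raises `UnderflowError`.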

 

2.2        Division

In division, the mantissa of the numerator is divided by that of the denominator. The denominator exponent is subtracted from the numerator exponent. The quotient mantissa is normalized to make the most significant digit non-zero and the exponent is appropriately adjusted.

Example 2.3. Divide the following:

1. .8867 E 2 ÷ .1234 E -98

2. .7689 E 5 ÷ .3456 E 56

3. .3452 E 45 ÷ .6754 E 68

Solution: 1. .8867 E 2 ÷ .1234 E -98 = 7.1855 E 100 = .7185 E 101. The result overflows.

2. .7689 E 5 ÷ .3456 E 56 = 2.2248 E -51 = .2224 E -50

3. .3452 E 45 ÷ .6754 E 68 = .5111 E -23
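The division rule can be sketched in Python under the same assumptions as before: positive, normalized operands held as (four-digit mantissa, exponent) pairs, with extra quotient digits chopped. `fp_div` and `UnderflowError` are hypothetical names.

```python
DIGITS = 4    # mantissa digits
MAX_EXP = 99  # largest exponent magnitude that can be stored

class UnderflowError(ArithmeticError):
    """Hypothetical error for an exponent below -99."""

def fp_div(a, b):
    """Return a / b for positive, normalized (mantissa, exponent) pairs."""
    (ma, ea), (mb, eb) = a, b
    if mb == 0:
        raise ZeroDivisionError("division by zero")
    # Quotient of the mantissas with four decimal places, chopped.
    q = (ma * 10 ** DIGITS) // mb
    exp = ea - eb                     # subtract the exponents
    if q >= 10 ** DIGITS:             # quotient mantissa is 1 or more
        q //= 10                      # shift right one place
        exp += 1
    if exp > MAX_EXP:
        raise OverflowError("exponent exceeds 99")
    if exp < -MAX_EXP:
        raise UnderflowError("exponent below -99")
    return q, exp

print(fp_div((7689, 5), (3456, 56)))   # (2224, -50)
print(fp_div((3452, 45), (6754, 68)))  # (5111, -23)
```

Because both mantissas lie between .1000 and .9999, their quotient lies between .1 and 10, so at most one right shift is needed and no left shift.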

 

2.3        Multiplication

Two numbers are multiplied in the normalized floating-point mode by multiplying the mantissa and adding the exponents. After the multiplication of the mantissa, the result mantissa is normalized as in addition or subtraction operation and the exponent is appropriately adjusted.
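As a rough illustration of this rule, the Python sketch below multiplies two positive, normalized numbers held as (four-digit mantissa, exponent) pairs. The eight-digit product is normalized first and then chopped to four digits. `fp_mul` and `UnderflowError` are hypothetical names, and the operands in the usage lines are made up for the example.

```python
DIGITS = 4    # mantissa digits
MAX_EXP = 99  # largest exponent magnitude that can be stored

class UnderflowError(ArithmeticError):
    """Hypothetical error for an exponent below -99."""

def fp_mul(a, b):
    """Return a * b for positive, normalized (mantissa, exponent) pairs."""
    (ma, ea), (mb, eb) = a, b
    prod = ma * mb                    # eight-digit product of the mantissas
    exp = ea + eb                     # add the exponents
    if prod == 0:
        return 0, 0
    if prod < 10 ** (2 * DIGITS - 1): # leading zero: shift left one place
        prod *= 10
        exp -= 1
    prod //= 10 ** DIGITS             # chop to four mantissa digits
    if exp > MAX_EXP:
        raise OverflowError("exponent exceeds 99")
    if exp < -MAX_EXP:
        raise UnderflowError("exponent below -99")
    return prod, exp

print(fp_mul((2000, 3), (3000, 4)))  # (6000, 6)
print(fp_mul((5000, 3), (4000, 4)))  # (2000, 7)
```

The product of two mantissas between .1 and 1 lies between .01 and 1, so at most one left shift is ever needed.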

Example 2.4. Multiply the following:

1. .4567 E 31 and .3456 E -12

2. .1111 E 67 and .1345 E -87

3. .4563 E 56 and .3452 E 44

4. .5673 E -55 and .1234 E -44

Solution: 1. .4567 E 31 × .3456 E -12 = .1578 E 19

2. .1111 E 67 × .1345 E -87 = .01494295 E -20 = .1494 E -21

3. .4563 E 56 × .3452 E 44 = .1575 E 100. The result overflows.

4. .5673 E -55 × .1234 E -44 = .07000482 E -99 = .7000 E -100. The result underflows.

Few questions for practice

 

Example 2.5. Represent 657.9 × 10^67 in normalized floating point mode.

Example 2.6. Subtract the floating point numbers .9876 E 45 and .3456 E 47.

Example 2.7. Find the value of (1 + x)^2 where x = 0.4523 E 3.

Example 2.8. Apply all the arithmetic operations on any two normalized floating point numbers.




In computer arithmetic, floating point representation provides the following benefits:

Precision: Floating point numbers carry a fixed number of significant digits, so extremely big and extremely small values are represented with the same relative precision. Fixed-point numbers, in contrast, have a limited range. For computations in science and engineering, this flexibility is essential.

Dynamic Range: Floating-point encoding offers an expanded dynamic range. For example, IEEE 754 double precision represents normalized magnitudes from roughly 10^-308 to 10^308. Applications such as scientific modeling and simulations require this range.

Normalized Form: Floating point numbers are generally kept in normalized form, which means that the most significant digit of the mantissa is not zero. This format reduces the accuracy lost during arithmetic operations, because no mantissa digits are wasted on leading zeroes. For instance, the number 0.00123 is stored as .1230 E -2 in the notation used above, or written as 1.23 × 10^-3 in scientific notation.

Efficient Arithmetic Operations: Addition, subtraction, multiplication, and division on floating-point numbers are carried out efficiently in hardware by floating-point units (FPUs). Scientific simulations, visual rendering, and other computationally demanding applications rely on these operations.

Standardization: The representation and arithmetic operations for floating point numbers are defined by the IEEE 754 standard. This standard makes numerical code portable and interoperable across platforms and programming languages.

Scientific Notation: Floating-point representation follows the form of scientific notation (mantissa × base^exponent). Extremely large or small numbers can be expressed more simply and manageably in this style.

Keep in mind that although floating point representation offers many benefits, it also has drawbacks, such as rounding errors during arithmetic operations owing to finite precision. Developers must consider these trade-offs when creating numerical algorithms.
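The range and the rounding behaviour mentioned above can be observed directly in Python. The sketch assumes that Python's `float` is an IEEE 754 double, which holds on common platforms; it uses only the standard `sys` and `math` modules.

```python
import math
import sys

largest = sys.float_info.max    # largest finite double
smallest = sys.float_info.min   # smallest positive normalized double
digits = sys.float_info.dig     # decimal digits that survive a round trip

print(largest)    # 1.7976931348623157e+308
print(smallest)   # 2.2250738585072014e-308
print(digits)     # 15

# Finite precision: 0.1 and 0.2 have no exact binary representation.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Binary analogue of the normalized form: mantissa in [0.5, 1), base 2.
mantissa, exponent = math.frexp(6.5)
print(mantissa, exponent)  # 0.8125 3
```

`math.frexp` returns a mantissa between 0.5 and 1 together with a power of 2, which plays the same role for binary numbers as the .dddd E n form does for the decimal machine in this post.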
