The module for this lesson is still under development. Please contact us if you have any questions.

 

Beware of Input Buffer Misbehavior & Make Your Code Behave – CS1

 

Read Background
Execute Lab Assignment
Complete Security Checklist
Answer Discussion Questions

Background


Summary:

Buffer vulnerabilities cause software security problems and lead to buffer attacks that make code misbehave. This hands-on module in secure coding focuses on how to use defensive programming to protect high-level code from buffer vulnerabilities arising from some built-in library functions. Some instances of such input buffer vulnerabilities in C, a standard high-level programming language, are introduced and demonstrated in this exercise.

Description:

Reading characters with certain C library functions can introduce security problems. getchar() and getc() are examples: both are prone to ‘input buffer misbehavior’ issues involving the standard input stream (stdin), which lead to incorrect inputs and/or skipped inputs. This lab module familiarizes students with such internal buffer vulnerability problems, teaches them how to handle these issues, and introduces the preferred/recommended programming practices for such situations. Overall, this hands-on exercise teaches students how to code securely and responsibly in ‘input buffer misbehavior’ scenarios involving the standard C library. Buffer misbehavior is common in C/C++ because these languages expose the low-level representational details of buffers as containers for data types. Such issues therefore need to be avoided through secure coding, that is, by maintaining a high degree of correctness in the code that performs buffer management. It has also long been recommended to avoid, or to use with great care, a list of standard C/C++ library functions that may lead to buffer misbehavior issues. This list includes getc, getchar, gets, scanf and strcpy.

Risk – How Can It Happen?

Most input buffer vulnerability problems in C can be traced directly back to the standard C library. Risky cases include character reading operations that use library functions which do not perform argument checking. One instance of such a risky element is reading a sentinel character through getchar() or getc() to determine whether to continue executing a program loop. In that instance a lot is at stake, because the whole program’s behavior depends on the success of a single library function call. If the call fails, it produces logical errors and/or runtime errors, which lead to wrong output. C programs written today still suffer from improper use of these function calls because developers are seldom taught how to code securely to cope with ‘input buffer misbehavior’ issues. Some programmers pick up a hint here and there, but even the good ones can become victims in this regard. They may add their own checks on the arguments of library functions and may wrongly reason that the use of a ‘potentially dangerous’ function is ‘safe’ in some particular cases.

Examples of Occurrence

On Nov 2, 1988, one of the first computer worms was released and distributed via the Internet. It came to be known as the Morris worm. It is considered to be the first Internet worm and was certainly the first to gain significant mainstream media attention. The Morris worm exploited the buffer misbehavior of a standard C library function, gets(), in fingerd. It also resulted in the first conviction in the US under the 1986 Computer Fraud and Abuse Act. The worm was written by Robert Tappan Morris, a student at Cornell University, and was released from MIT to disguise the fact that it originally came from Cornell. According to its creator, it was not written to cause damage, but to gauge the size of the Internet. The Morris worm worked by exploiting known vulnerabilities in UNIX sendmail, finger and rsh/rexec, and it targeted weak passwords as well. A portable ‘C’ code component of the worm was used to pull over (download) the main body, which then ran on the infected systems, loading them down and making them peripheral victims. A supposedly unintended consequence of that code, however, made the worm more damaging: a computer could be infected multiple times, and each additional process would slow the machine down, eventually to the point of being unusable. This had the same effect as a fork bomb and crashed the computer.

The critical error that transformed the worm from a potentially harmless intellectual code exercise into a virulent denial of service attack was in the spreading mechanism. The worm could have determined whether to invade a new computer by asking if there was already a copy running. However, just doing this would have made it trivially easy to kill; everyone could just run a process that would answer “yes” when asked if there was already a copy, and the worm would stay away. Morris therefore directed the worm to copy itself even if the response was “yes”, 1 out of 7 times. This level of replication proved excessive and the worm spread rapidly, infecting some computers multiple times. Morris remarked, when he heard of the mistake, that he “should have tried it on a simulator first.”

Examples in Code

Let’s look at a simple C program example in which the above-mentioned ‘input buffer misbehavior’ issues can occur:

Code Example:

/* Sample C program: the user guesses a preset 6-character password one
   character at a time. After each guess the program reports whether the
   guess was correct, and a sentinel-controlled while loop asks whether
   the user wants to continue. */

#include <stdio.h>

int main(void)
{
    char password[7] = "secret";
    char guess[7];
    char cont = 'Y';
    int count = 0;

    while ((cont == 'Y' || cont == 'y') && (count < 6))
    {
        printf("Enter password character # %d: ", count + 1);
        guess[count] = getchar();   /* reads one character from stdin */

        if (guess[count] == password[count])
        {
            printf("You guessed correctly! ");
            count++;
        }
        else
            printf("You did not guess correctly! ");

        if (count < 6)
        {
            printf("\nDo you want to continue? (Enter uppercase/lowercase Y for Yes, Enter any other character for No): ");
            cont = getchar();       /* reads the next character from stdin */
        }
        else
            printf("You are done! ");
    }

    return 0;
}

            In the above program example, there is a problem while reading the ‘cont’ character input during the first while-loop iteration to decide whether to continue execution or not. getchar() reads its input directly from stdin (standard input), which holds a character for every keystroke, including the newline character produced by the ‘Enter’ key that the user hits after typing the first password character. The first getchar() call consumes the guessed character, so the second call returns the leftover newline instead of waiting for the user’s answer. Hence, the program ends abruptly without giving the user the opportunity to enter “Yes”: on the very first iteration it assumes that the user entered a character other than ‘Y’ or ‘y’. This is an instance of ‘input buffer misbehavior’ due to an internal buffer vulnerability associated with the getchar() library function, which returns every character in the input stream and does not filter user keystrokes for valid character inputs.

Lab Assignment

For this hands-on secure coding exercise, refer to the code example in the Examples in Code section and study it carefully. Based on your study, answer the following questions:

Question 1: After you have studied the above code example, try to understand the intention of the program. What is the objective of the program? What is the code meant to do?

Question 2: Run the given code example and observe the program outcome. What do you see when you execute the code? Report the program’s output. Do the observed results match the expected outcome? Share your thoughts on that.

Question 3: Given your observations from testing the program (see Question 2), what is wrong with the code that prevents it from performing the way it is supposed to? Are there any bugs/defects in the code that make it behave the way it does when you run it? Share your views. [Hint: Look into the input operations of the program that use getchar(). Try to understand how getchar() works and what exactly it is doing in the context of the given program!]

Question 4: What happens when you replace the getchar() calls in the given program with getc()? Note that getc() takes the input stream as its argument, so the call becomes getc(stdin). Do you see any improvements in the program outcome or the way it behaves? Or does it produce the same output as before? Report your findings. What can you conclude from these observations about the behavior of both getchar() and getc()? [Hint: Look into the getc() function. Try to understand how it works and what exactly it is doing in the context of the given program!]

Question 5: Try to find a way to solve the problem with the given code. There are multiple ways to address the issue with the program. Correct the code so that it works as intended and describe the technique you used to rectify the problem. [Hint: Replace the getchar() function calls of the program with the scanf() library function in C. Does that make the program work correctly? Look into the scanf() function. Try to understand how it works and what exactly it can do to help in the context of the given program!]

Question 6: Once you have found the problem with the given code and discovered one technique for solving it (refer to the hint in Question 5), you should realize that there are other remedies as well. One way of addressing the issue without removing the getchar() calls from the code is the fflush(stdin) instruction. Try to figure out where to add the fflush(stdin) call so that the code works correctly while retaining the getchar() operations. [Hint: Look into the fflush() function. Try to understand how it works and what exactly it can do to help in the context of the given program! Note that the C standard defines fflush() only for output streams, so fflush(stdin) has undefined behavior in standard C and discards pending input only on implementations that support it as an extension.]

Security Checklist


Vulnerability: Input Buffer Misbehavior  
Course: CS1  
1. Check each line of code  
   1.1 Find and underline/circle each input operation  
2. For each input operation, which of the following is applicable?  
   2.1 Check type of input data (char or not)  
   2.2 Check library function involved (getchar, getc, gets, scanf, etc.)  
   2.3 Check format (%c conversion specifier, escape sequence “\n”)  
Highlighted areas indicate vulnerabilities!  

Discussion Questions

  1. Now that you have completed the Section 9 Lab Assignment, what is your take on the overall art of defensive programming? Why do you think it is important to code securely and responsibly? Briefly describe the significance of secure coding and its impact/effect.
  2. What do you understand by ‘input buffer misbehavior’?
  3. How serious is ‘input buffer misbehavior’ as a security threat? What can go wrong due to this type of vulnerability, and what can result from it?
  4. If a method throws a checked exception, how can the calling method avoid coding a try/catch block to handle the exception? In what cases might the calling method use this option?
  5. Given that you have now experienced an instance of ‘input buffer misbehavior’ through the code example in the Examples in Code section, can you write another code example of your own that contains ‘input buffer misbehavior’ in the same way? Try to produce such a code example.
 
Copyright © Towson University