Sunday, June 1, 2014

Little Steps Towards Kernel Programming

I got a chance to work on a bug which is probably in the VFS. Brian Foster at Red Hat is mentoring me. I chose to document the process here.

This post is a mail Brian sent to help me get my workspace and test setup ready before I started working on the bug.

You probably want to start out just getting comfortable with compiling a
kernel. My upstream tree (Linus' tree) comes from here:

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

So that's the latest upstream (all sub-trees are merged into there).
This might be ultimately something that has to go into a vfs subtree,
but we can figure that out later.

My normal workflow is relatively simple. Install rawhide (fedora latest)
to a VM:

https://fedoraproject.org/wiki/Releases/Rawhide

Grab the /boot/config-x.y.z kernel config file associated with the
current running kernel and copy it to the root of your git repo as
.config. Run 'make oldconfig' to resolve any unconfigured options. You
could also run 'make localmodconfig LSMOD=<file>', where <file> contains
the 'lsmod' output from your VM, to speed up compiles (i.e., 'lsmod >
/tmp/lsmod' on the VM, then copy the file to the host). This disables
any kernel modules that are not currently loaded.
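The config preparation above can be sketched as a few commands (a sketch only; the config file name and kernel version will differ on your machine):

```shell
# in the root of the kernel git tree
cp /boot/config-$(uname -r) .config    # start from the running kernel's config
make oldconfig                         # resolve any newly added options

# optional: limit the build to modules currently loaded on the VM
# (run 'lsmod > /tmp/lsmod' on the VM first, then copy the file here)
make localmodconfig LSMOD=/tmp/lsmod
```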

Steps to build/install the kernel:

- make -j 8
- make modules_install INSTALL_MOD_PATH=./install

The above will install the kernel modules to a local subdirectory such
that you can copy them to the VM:

- scp arch/x86/boot/bzImage root@<VM.IP>:/boot/bzImage
- rsync -rv install/lib/modules/x.y.z root@<VM.IP>:/lib/modules/

Create a new initramfs on the VM:

- dracut -H /boot/initramfs.img x.y.z
(x.y.z matches the subdirectory under /lib/modules)
- Edit /boot/grub2/grub.cfg to add a new boot option that points to
  /boot/bzImage and /boot/initramfs-x.y.z (whatever you named the
files). Just copy an existing entry.
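A hand-added grub.cfg entry might look roughly like this hypothetical sketch (the root= device and file paths vary per machine, which is why copying an existing entry is the safest route):

```shell
# hypothetical /boot/grub2/grub.cfg entry; adjust paths to match your files
menuentry 'Test kernel x.y.z' {
        linux /bzImage root=/dev/vda1 ro
        initrd /initramfs-x.y.z.img
}
```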

... and at that point you should be able to reboot using the alternate
boot option. I wouldn't worry about anything else with regard to the bug
until you get comfortable with that.

Brian

Saturday, December 1, 2012

Changing Directories Quickly in Linux

For the past few months I have been keeping track of the commands I use most frequently, using the following command:
awk '{print $1}' ~/.bash_history | sort | uniq -c | sort -n | tail -5
The command 'cd' always shows up in the output. As a developer, the directories I visit most frequently are my git repository root, three more directories inside the repo (the components I work on most), the logs directory, and the tests directory for running tests. I wanted to figure out a quick way to jump between these directories.
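To see what the pipeline produces, here it is run on a tiny made-up history file (the file name and commands are invented just to show the shape of the output):

```shell
# build a small fake history file
printf 'cd /tmp\nls -l\ncd /var/log\ngit status\ncd ~\n' > /tmp/hist_sample

# count the first word of each line; the most frequent command ends up last
awk '{print $1}' /tmp/hist_sample | sort | uniq -c | sort -n | tail -5
```

Here 'cd' (used 3 times) lands on the last line, which is exactly the kind of signal the real ~/.bash_history run gives.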

I found the following post useful in finding the solution to my problem:
http://unix.stackexchange.com/questions/31161/quick-directory-navigation-in-the-terminal

The commands that helped me here are pushd and 'dirs -v', along with the ability to refer to a directory as
'~<num>', where <num> is the number shown against that directory in the 'dirs -v' output.
The idea is to pre-populate the list of directories every time I start a terminal.
I don't want the order of this list to change, so that over time I need to run 'dirs -v' less often, because I remember the corresponding <num> for each of these directories.
I saved the list as follows in a .bash_dirs file in my home directory.


pushd -n /home/pranithk/.scripts > /dev/null
pushd -n /usr/local/var/log/glusterfs/bricks > /dev/null
pushd -n /usr/local/var/log/glusterfs/ > /dev/null
pushd -n /var/lib/glusterd > /dev/null
pushd -n /home/pranithk/workspace/gluster-tests/afr > /dev/null
pushd -n /home/pranithk/workspace/gerrit-repo/tests/bugs > /dev/null
pushd -n /home/pranithk/workspace/gerrit-repo/libglusterfs/src/ > /dev/null
pushd -n /home/pranithk/workspace/gerrit-repo/xlators/mgmt/glusterd/src/ > /dev/null
pushd -n /home/pranithk/workspace/gerrit-repo/xlators/cluster/afr/src/ > /dev/null
pushd -n /home/pranithk/workspace/gerrit-repo/ > /dev/null

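The effect of those 'pushd -n' calls can be seen in a tiny standalone demo (the directories are made up; '-n' adds an entry to the stack without actually changing directory):

```shell
#!/bin/bash
# pushd and dirs are bash builtins, so this needs bash
cd /tmp
pushd -n /var > /dev/null   # pushed first, so it ends up lower in the stack
pushd -n /usr > /dev/null   # pushed last, so it sits just below the current dir
dirs -v                     # entry 0 is always the current directory
```

Entry 0 is /tmp, entry 1 is /usr, entry 2 is /var: the reverse of the push order, which is why the .bash_dirs file lists the most-used directories last.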
I added the following lines in my .bashrc file to load it every time the terminal is started.

if [ -f ~/.bash_dirs ]; then
        . ~/.bash_dirs
fi

Now every time I start the terminal I have the directories already populated as below:

~
10:22:44 :) ⚡ dirs -v
 0  ~
 1  ~/workspace/gerrit-repo/
 2  ~/workspace/gerrit-repo/xlators/cluster/afr/src/
 3  ~/workspace/gerrit-repo/xlators/mgmt/glusterd/src/
 4  ~/workspace/gerrit-repo/libglusterfs/src/
 5  ~/workspace/gerrit-repo/tests/bugs
 6  ~/workspace/gluster-tests/afr
 7  /var/lib/glusterd
 8  /usr/local/var/log/glusterfs/
 9  /usr/local/var/log/glusterfs/bricks
10  ~/.scripts

Note that the number '0' is always associated with the present directory. The directories appear in the reverse order of the list in the file, because pushd keeps the most recently pushed directory first.

10:26:33 :) ⚡ cd ~8

/usr/local/var/log/glusterfs 
10:26:36 :) ⚡ cd ~1

~/workspace/gerrit-repo (master)
10:26:39 :) ⚡ 

As shown above, if I want to change to my logs directory all I need to type is 'cd ~8', and if I want to change to my git repository root I type 'cd ~1'. So much typing and excessive pressing of 'tab' saved :-).

Have fun and enjoy.

Friday, June 15, 2012

2 Regressions in 3 Days

In the last 3 days, 2 really bad regressions I introduced into afr were found :-(. This is not the first time I have introduced regressions; I introduced some before because I did not have the complete picture, and I felt I could reduce them as I learned more about the product. This time it is different: I had actually seen the second regression in my unit testing. I assumed it appeared because I enabled stat-prefetch, and disabling it made the test pass, but I did not realize that the test passed only because the inode the fop happened on belonged to a different graph and did not have the split-brain flag. I could have avoided this if I had automated that test case. This made me feel pretty bad. After being frustrated with myself for a day, I decided to reduce the probability of introducing regressions in the future.

This is my plan to minimize regressions in the future:
1) Every commit I make from now on is going to have 3 things in the comments:
*) RCA: description of the root cause of the bug.
*) Fix: what changes I introduced to fix the issue.
*) Testcases: URL to the test cases/automated scripts I used for testing.
2) If a test is automatable I should automate it without fail, no excuses.

In code review my colleagues can point out if I have missed any testcases.
Automation should make sure I don't introduce bugs while porting the fixes to other branches. Let's see how this approach goes.

Sunday, October 16, 2011

Let me start over

Ever since I read the article http://norvig.com/21-days.html by Peter Norvig, I have wanted to learn other programming languages. I decided to learn Scheme first and picked up the famous book "Structure and Interpretation of Computer Programs". I started it 5 months ago, and tomorrow I am going to start reading the first chapter for the 3rd time. The first time, I thought I would read the theory on weekdays and solve the problems on the weekend. Bad idea: if I missed reading the theory for 1 or 2 days in a row, or did not get to solve the problems on one of the weekends, by the next week I would have forgotten what I read completely and had to start all over again. I became sick of that and stopped. A month back I got a 5-day vacation, so I started the book again and completed reading the first chapter in the first 3 days, but did not get to solve the problems. Now I have forgotten what I read completely.
               It has been such a long time since I worked through a textbook with both theory and problems that I forgot how we used to do those books. We did not read a whole chapter at a time and then solve all the exercise problems at once. We read some theory, took notes, and solved problems applying that theory the same day for homework; that is what I am going to do this time. I hope this will be regular enough to work out. I will post the problems here as and when I solve them.

Friday, September 30, 2011

Negative testcases

This past week proved to me that the way I test my code is not good enough. The QA engineer who tested my code logged many bugs against negative test cases. I mostly test the positive test cases, the error cases the user should see, and the negative cases I think are *possible*. How do I decide what is possible? Intuition. The code kept hitting different asserts and crashing because I did not handle those error conditions.
        After looking at all the crashes, I changed my way of testing a little. I first test the positive test cases and the user-perceivable error cases. Then I either instrument the code or assign error values to the relevant variables in gdb, so that execution hits the remaining negative test cases for the functions I modify.
       This method has already given good results for my recent commit: I found a new bug lingering in the code. It is very difficult to hit that case with normal testing; maybe a good code reviewer could have found it.
         If this model does not scale well, maybe I should introduce fault-injection points into the code so that the bugs in negative test cases are caught.

How do you go about testing negative test cases in your code?

Sunday, September 25, 2011

Learning Software Design

I have been working as a software developer for 4 years. For the first 2 years there was very little improvement in my programming style, code-organizing skills, or design in general. There were functions spanning thousands of lines. I wrote code with a for loop inside a conditional inside a big while loop. There would be multiple duplicate functions iterating over the same data structures just to check different conditions. This resulted in multiple similar bugs in different places, generally the same bug repeated in all of those duplicate functions.
        Of course there were code reviews; the problem was that there were multiple reviewers for different bugs. The tools show only the diff for the current fix, so each reviewer verified only that it fixed that particular bug. After the first year the feature I was working on got so big and delicate that no one knew what new bugs changing a single line would introduce. It was re-written, but a few months after the re-write the project started going in a similar direction.
        I figured out that the problem was with the way I was writing and testing code. I started reading books on programming, software design, etc. Some improvements were easy to implement: making sure function length is < 80 lines (the language is C), keeping the tab width at 8 spaces, and restructuring the logic if any line goes beyond 80 characters. Then came the difficult things like coupling and cohesion. I found it difficult to think about those while changing the code to accommodate new requirements/fixes, and I did not realize that the complexity of the code was increasing.
        I came across the tactical design presentation given by Glenn Vanderburg about a year back. It changed the way I write and think about code significantly. Watching it was one of the most valuable hours I have spent on improving myself.
        He suggests building the right value system while coding. These are the quotes from Ralph Johnson about life that he shows in one of the slides:
"With the right value system, making good short-term decisions leads to good long-term results".
"I think that is the purpose of a value system. We need to figure out the way to live so that when we are in the middle of life we 'do the right thing.' When our neighbor comes over to argue with us, we are not going to start thinking about how this will affect our lives 10 years from now, but we react according to the way we were taught, and the way we taught ourselves."
         We need to build a value system that makes short-term decisions at the level of functions and lines of code lead to good long-term results in terms of overall design.

This is that system:
The two coding rules:
1) Short methods/functions (the 80-line function rule)
2) Few methods per class (have fewer functions that take a structure as the primary argument)

Three Principles:
1) Do one thing (helps to construct distinct basic functions for the operations that can be done on a structure/class)
2) DRY (Don't Repeat Yourself; helps avoid duplication)
3) Uniform level of abstraction (helps keep code at the same level of abstraction)

I will elaborate on how these principles improved my coding skills in future posts.

PS: I am not very good at English; let me know if you find any mistakes and I will correct them.

Friday, September 23, 2011

Hello World

Have been postponing this post for a year now. Better late than never. This is the place where I will share my thoughts and things I learn in my career and life.