CSCI 241 Labs: Lab 12
A DNA Class


There are 5 checkpoints in this lab, including the clean-up checkpoint. You and your partner should work together using just one of your accounts. CHANGE WHO IS CONTROLLING THE COMPUTER AFTER EACH CHECKPOINT! Make sure that each partner understands all steps in the lab. If you need help with any exercise, raise your hand.

Copy the lab materials to your account from /home/student/Classes/Cs241/Labs/Lab12

Change directories into Lab12 and start running BlueJ.

 

A DNA Class

In this lab, you and your partner will develop a class to model strands of DNA. You will finish and create various methods that are useful in such a class. Keep your prelab reading handy; you may need to refer to it to complete some of the exercises.

Open the Bioinformatics project inside the Lab12 directory. This project already contains a shell (partially written) class named DNA that represents DNA objects. You can compile it and create DNA objects on the object bar without making any changes to the class. Go ahead and try it! (Remember that when you run the constructor and enter a String, BlueJ requires that you surround that String with double quote marks.)

The class contains only one data member: a String named sequence. This String contains the characters representing the nucleotides in the DNA. The class contains one constructor, which takes a single String parameter representing the nucleotides needed to initialize the DNA object. The constructor is currently just a stub.

Confer with your partner and write the following code inside the constructor to complete it:

  1. Convert the constructor's parameter to uppercase and assign the result to sequence.
  2. Validate that the letters in sequence are all valid nucleotides. There is another method called validate(), also currently a stub, located just below the constructor. Validation of input is typically done in a separate method like this. validate() is a private method. Complete that method, and call validate() in the last line of the constructor.

    There are two different (but equally correct) approaches to validating the DNA sequence.

    1. Add a for loop to walk through sequence one character at a time. If the character is not 'A', 'C', 'G' or 'T', validate() should print an error message to System.out. Continue the loop until all invalid characters are found and printed.

      or

    2. Use pattern matching (a regular expression) to make certain that the sequence contains 1 or more instances of 'A', 'C', 'G' or 'T' and no other characters. This approach reduces validate() to only a couple of lines of code.

    Complete the DNA class constructor and validate() method, and test them with both good and bad data.
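
As a starting point for discussion, here is a minimal sketch of both validation approaches. The class name DnaValidationSketch and the helpers isNucleotide(), countInvalid(), matchesPattern(), and getSequence() are assumptions made for illustration; they are not part of the provided DNA shell. Only sequence, the constructor, and validate() come from the lab, and your own solution may reasonably differ.

```java
// Illustrative sketch only. DnaValidationSketch and its helper methods are
// hypothetical names, not part of the lab's DNA shell class.
class DnaValidationSketch {
    private String sequence;

    DnaValidationSketch(String nucleotides) {
        // Step 1 from the lab: convert the parameter to uppercase.
        sequence = nucleotides.toUpperCase();
        // Step 2 from the lab: call validate() as the last line.
        validate();
    }

    // Approach 1: walk the sequence and report every invalid character.
    private void validate() {
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (!isNucleotide(c)) {
                System.out.println("Invalid nucleotide '" + c + "' at index " + i);
            }
        }
    }

    private static boolean isNucleotide(char c) {
        return c == 'A' || c == 'C' || c == 'G' || c == 'T';
    }

    // Same loop as validate(), but returns a count so it is easy to test.
    static int countInvalid(String s) {
        int invalid = 0;
        for (int i = 0; i < s.length(); i++) {
            if (!isNucleotide(s.charAt(i))) {
                invalid++;
            }
        }
        return invalid;
    }

    // Approach 2: String.matches() succeeds only if the whole string
    // consists of one or more of A, C, G, T.
    static boolean matchesPattern(String s) {
        return s.matches("[ACGT]+");
    }

    String getSequence() {
        return sequence;
    }
}
```

Note that the loop version can report each bad character and its position, while the pattern version only says whether the whole sequence is acceptable.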

Checkpoint 1: Show us your revised constructor, and how it runs. Be ready to answer the following questions:

  1. Why did we direct you to make sequence all upper case?
  2. Why is validate() private?
  3. Which approach did you use to validate the sequence? Why?

 

toRNA

One important step in creating proteins is converting DNA to RNA. This requires replacing every T in the sequence with a U. The String class has a method named replaceAll(); use it to make the substitutions. The toRNA() method should return the new String.

The toRNA() method is already in the class as a stub.

Complete the toRNA() method and test it thoroughly.
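
The substitution described above can be sketched as follows. This is a minimal illustration, assuming the sequence is already uppercase; the class name RnaSketch and the static form of toRNA() are assumptions (in the lab, toRNA() is an instance method that works on the sequence data member).

```java
// Illustrative sketch only. RnaSketch is a hypothetical class; in the lab,
// toRNA() belongs to the DNA class and reads the sequence data member.
class RnaSketch {
    static String toRNA(String sequence) {
        // replaceAll() returns a new String; the original is not changed.
        return sequence.replaceAll("T", "U");
    }

    public static void main(String[] args) {
        System.out.println(toRNA("GATTACA")); // prints GAUUACA
    }
}
```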

Checkpoint 2: Show us your revised toRNA() method, and your tests. Be ready to answer the following questions:

  1. Why don't we need to write a special validate() method for the RNA sequence?
  2. Would it be a better idea to create an RNA class and have this method return an RNA object, rather than a String?

 

Reverse Complement

As discussed in the prelab reading, DNA comes in two strands. Given one strand of DNA it is possible to completely recreate the other strand. This is done by swapping As with Ts and Cs with Gs and then putting the sequence in reverse order. This means, for example, that each original T will be replaced with an A AND each original A will be replaced with a T.

Work with your partner to create pseudocode for such a method. Decide whether you want to do everything in one loop, or as two separate, sequential loops. Then, add a new method reverseComplement() to your DNA class. It should take no parameters and return a String object that contains the reverse complement of sequence. Thoroughly test your method.
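
One possible shape for this method is sketched below, using a single loop that walks the sequence from back to front. It is only one of several reasonable designs. The class name ReverseComplementSketch, the helper complement(), and the static form of reverseComplement() are assumptions for illustration; in the lab the method takes no parameters and uses the sequence data member. The sketch also assumes the sequence has already been validated.

```java
// Illustrative sketch only. ReverseComplementSketch and complement() are
// hypothetical names, not part of the lab's DNA shell class.
class ReverseComplementSketch {
    // Map one nucleotide to its partner: A<->T and C<->G.
    // Any other character is returned unchanged (assumes validated input).
    private static char complement(char c) {
        if (c == 'A') return 'T';
        if (c == 'T') return 'A';
        if (c == 'C') return 'G';
        if (c == 'G') return 'C';
        return c;
    }

    static String reverseComplement(String sequence) {
        StringBuilder result = new StringBuilder(sequence.length());
        // Walking backward reverses the order; complement() swaps each base.
        for (int i = sequence.length() - 1; i >= 0; i--) {
            result.append(complement(sequence.charAt(i)));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseComplement("ATGC")); // prints GCAT
    }
}
```

Because each character is swapped exactly once as it is copied, an original A never gets turned into a T and then back into an A, which is the risk with two successive whole-string replacements.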

Checkpoint 3: Show us your reverseComplement() method, and your tests. Be ready to answer the following questions:

  1. Did you decide to use any loops? Why or why not?
  2. Could you use a switch statement to do the character substitutions? Why or why not?
  3. Why is it important to be able to generate the reverse complement of a sequence? (Hint: Refer to your prelab reading.)

 

Making Real World DNA Objects

The second class in the Bioinformatics project is named Demo. It contains a main method that reads DNA sequence data from a file and creates a DNA object. This data is from a blue whale. Complete the main method by adding lines to do the following:
  1. Print the sequence of the DNA to System.out.
  2. Print the corresponding RNA sequence.
  3. Print the reverse complement of this sequence.
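
The three print steps might look roughly like the sketch below. It does not reproduce the provided Demo class: the file-reading code is replaced by a short hard-coded string, and MiniDna, DemoSketch, getSequence(), report(), and the output labels are all assumptions made so the example is self-contained. In your project, use the DNA object that main already creates and whatever accessor your DNA class provides.

```java
// Illustrative sketch only. MiniDna stands in for the lab's DNA class, and
// getSequence() is an assumed accessor that the shell may not provide.
class MiniDna {
    private final String sequence;

    MiniDna(String nucleotides) {
        sequence = nucleotides.toUpperCase();
    }

    String getSequence() {
        return sequence;
    }

    String toRNA() {
        return sequence.replaceAll("T", "U");
    }

    String reverseComplement() {
        StringBuilder result = new StringBuilder(sequence.length());
        for (int i = sequence.length() - 1; i >= 0; i--) {
            char c = sequence.charAt(i);
            if (c == 'A') result.append('T');
            else if (c == 'T') result.append('A');
            else if (c == 'C') result.append('G');
            else if (c == 'G') result.append('C');
            else result.append(c);
        }
        return result.toString();
    }
}

class DemoSketch {
    // Builds the three lines of output so they can be printed or tested.
    static String report(MiniDna dna) {
        return "DNA: " + dna.getSequence() + "\n"
             + "RNA: " + dna.toRNA() + "\n"
             + "Reverse complement: " + dna.reverseComplement() + "\n";
    }

    public static void main(String[] args) {
        // The real Demo reads the sequence from a file; a short literal
        // is used here in its place.
        MiniDna dna = new MiniDna("gattaca");
        System.out.print(report(dna));
    }
}
```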

Checkpoint 4: Show us your main() method. Be ready to answer the following questions:

  1. About how many characters are in the sequence?
  2. This is still just a tiny fragment of DNA. In larger data sets there may be hundreds of thousands or millions of nucleotides in the sequence. What additional methods would you add to your DNA class to help make this data easier to interpret?

 

After the Lab

Remember to exit Firefox before you log out.


Checkpoint 5: For this last checkpoint, show us that you have logged out, turned off your monitor, cleaned up, and pushed in your chairs.