Session 5: Practice 0

- Time: 2h
- Date: Wednesday, Feb-12th-2020
- Goals:
  - Practicing with real DNA sequences downloaded from the Ensembl database
- Introduction
- Working in separate files
- Installing the tools on your Laptop
- Exercises
- End of the session
- Author
- Credits
- License
The goal of this practice is to create a library of functions for working with DNA sequences. We will call this initial library Seq0
These are all the functions that should be implemented. They will all be stored in the file Seq0.py
| Function name | Parameters | Return value | Description |
|---|---|---|---|
| seq_ping() | none | none | Test function. It just prints the "OK" message on the console |
| seq_read_fasta(filename) | filename: String | String | Open a DNA file in FASTA format and return the sequence as a string (it should only contain the characters 'A', 'T', 'G' or 'C') |
| seq_count_base(seq, base) | seq: String; base: character | Integer | Calculate the number of occurrences of the given base in the sequence |
| seq_count(seq) | seq: String | Dictionary | Calculate the number of occurrences of every base in the sequence. A dictionary with the results is returned. The keys are the bases and the values are their counts |
| seq_len(seq) | seq: String | Integer | Calculate the total number of bases in the sequence |
| seq_reverse(seq) | seq: String | String | Return the reverse sequence |
| seq_complement(seq) | seq: String | String | Return the complement sequence |
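As an illustration of the table above, here is a minimal sketch of four of these functions. It is only one possible way of writing them, not the reference solution, and it assumes the sequence contains only the characters 'A', 'T', 'G' and 'C'. The COMPLEMENT dictionary and the demo sequence are made up for this example.

```python
def seq_count_base(seq, base):
    """Return how many times the given base appears in the sequence."""
    return seq.count(base)


def seq_count(seq):
    """Return a dictionary: the keys are the bases, the values their counts."""
    return {base: seq_count_base(seq, base) for base in "ACGT"}


def seq_reverse(seq):
    """Return the sequence read backwards."""
    return seq[::-1]


# Base pairing used for the complement: A pairs with T, C pairs with G
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def seq_complement(seq):
    """Return the complement sequence, base by base."""
    return "".join(COMPLEMENT[base] for base in seq)


demo = "ATTCG"  # made-up sequence, only for testing the functions
print(seq_count_base(demo, "T"))  # 2
print(seq_count(demo))            # {'A': 1, 'C': 1, 'G': 1, 'T': 2}
print(seq_reverse(demo))          # GCTTA
print(seq_complement(demo))       # TAAGC
```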
All the programs we have created so far have been located in their own python files: one file per program. In that file we placed both the main code and the functions
When working on more complex projects, the python code is divided into small pieces and placed in different files. In this practice we will place all the functions for working with DNA sequences into the Seq0.py file, and we will create additional files for the application programs that will use that library
To access the functions defined in the Seq0 module, write this line at the beginning of your programs:
from Seq0 import *
In addition, as the Seq0 module will be located in the P0 folder, you have to tell Pycharm about it by right-clicking on the P0 folder and selecting Mark directory as/Sources Root
- Make sure you have the P0 folder

- Right click on it and select Mark directory as/Sources Root

- Notice how the P0 folder now has a different color

From Practice 0 onwards, you can work on your own laptop if you want. What is important is that you push all your code to your remote repository at Github. Remember: the more commits, the better

Some exercises must be tested in the lab, because we will communicate with some computers in the lab network. But many others, including the final project, can be done on your own laptop
All the tools we are using are Open Source and Multiplatform: You can install them on Linux, Mac or Windows

You will need to install these three tools: Python (3.6 or higher), Git and Pycharm Community Edition
Very likely you already have python (3.6 or higher) installed. To check it, open a terminal on your computer (in Windows it is called "Command Prompt", or "Símbolo del sistema" in Spanish) and type python. The python interpreter should open. The exact output depends on your operating system and python version. This is what I see on my laptop with Ubuntu/Linux 18.04:
$ python
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
If you cannot open the python console, install it from here: Python installation. Download the installer and execute it. WINDOWS users: It is very important that you check the option Add python 3.x to Path
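Once the interpreter opens, the version can also be checked from python itself. This is a small sketch that compares the running version with the minimum required for this course (3.6). The helper name python_is_recent_enough is made up for this example.

```python
import sys


def python_is_recent_enough(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


# sys.version starts with the version number, e.g. "3.6.9 (default, ...)"
print("Python", sys.version.split()[0])
print("Recent enough for this course:", python_is_recent_enough())
```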

Git is the tool that allows us to have a local repository on our computer and connect it to our remote repository located in Github
- Linux users: open a terminal and write this command:
sudo apt install git
- Mac/Windows users: You have to install git manually from here: Install Git
  - Download the installer
  - Run the installer and choose the default options (click on next, next, next...)
Once installed, check that it is working correctly. Open a terminal and type git --version. You will see something like this:
$ git --version
git version 2.17.1
(Your version may be different. It does not matter)
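The same check can be done from a python program, which is handy if you want to verify the three tools in one go. This is only a sketch: the function name git_version is made up, and it simply returns None when git is not found in the PATH.

```python
import shutil
import subprocess


def git_version():
    """Return the text printed by 'git --version', or None if git is missing."""
    if shutil.which("git") is None:
        return None
    # stdout=PIPE and universal_newlines=True also work on python 3.6
    result = subprocess.run(
        ["git", "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip()


version = git_version()
if version is None:
    print("git is not installed (or it is not in the PATH)")
else:
    print(version)
```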
You have to download the installer from this page: Pycharm Community Edition. Execute the installer and choose the default options
Ubuntu/Linux users: you can install it directly from the Ubuntu Store

Once everything is installed, you can proceed with the instructions given in Session 2: Working with Pycharm and Github
I will help you during the Lab sessions (and after the session if you want to stay). Remember: There are many of you, but I am only one, so be patient :-) If I do not have time to help you in this session, wait for the next one (or until after the class). In the meantime you can work in the lab
Practice 0 will be guided. Every exercise will teach you something. The goal is to have a working Seq0 module for processing DNA sequences and to test it with some example programs. In the final exercises you will have to write some programs for answering the questions
The seq_ping() function is just for testing. It prints the "OK" string on the console
- Filename: P0/Seq0.py
- Description: This file is the library for the DNA functions. In this first exercise you should write in there the seq_ping() function
- Filename: P0/Ex1.py
- Description: This is the main program. It should import the module Seq0. When this program is executed, you should see this on the console:
Testing the seq_ping() function
OK
Process finished with exit code 0
Congrats! You have created your first python module!
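The two files of this exercise can be sketched in a single runnable block. To keep it self-contained, the example writes a tiny Seq0.py into a temporary folder and adds that folder to sys.path, so the temporary folder plays the role of the P0 folder. In your project you would simply create P0/Seq0.py and P0/Ex1.py by hand. This is an illustration, not the reference solution.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of the library file (in the project: P0/Seq0.py)
SEQ0_SOURCE = 'def seq_ping():\n    print("OK")\n'

with tempfile.TemporaryDirectory() as folder:
    # The temporary folder stands in for P0
    Path(folder, "Seq0.py").write_text(SEQ0_SOURCE)

    # Python can only import Seq0 if its folder is in the search path
    sys.path.insert(0, folder)
    importlib.invalidate_caches()
    from Seq0 import seq_ping  # Ex1.py would use: from Seq0 import *
    sys.path.remove(folder)

# Main program (in the project: P0/Ex1.py)
print("Testing the seq_ping() function")
seq_ping()
```

The explicit form, from Seq0 import seq_ping, is used here because it makes clear where the function comes from. The wildcard form used in this practice imports every function of the module in the same way.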
Implement the seq_read_fasta(filename) function. It should open a file in FASTA format and return a String with the DNA sequence. The header (the first line) is removed, as well as the '\n' characters. This function should be written in the Seq0.py file
- Filename: P0/Ex2.py
- Description: Write a python program that opens the U5.txt file and prints the first 20 bases of the sequence on the console
- Output: This is what should be printed on the Console:
DNA file: U5.txt
The first 20 bases are:
ATAGACCAAACATGAGAGGC
Process finished with exit code 0
- Considerations:
  - Keep in mind that the DNA files are stored in the Session-04 folder and you are now working in the P0 folder. Create a constant with the path to the file:
    FOLDER = "../Session-04/"
  - Then another for the DNA filename:
    FILENAME = "U5.txt"
  - The full filename passed to seq_read_fasta() should be the string FOLDER + FILENAME
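The considerations above can be sketched as follows. This is one possible way of writing seq_read_fasta(), assuming the first line of the file is the header and every other line contains only bases. So that the example runs anywhere, it reads a made-up FASTA file written to a temporary folder. In Ex2.py the constants would be "../Session-04/" and "U5.txt" instead.

```python
import tempfile
from pathlib import Path


def seq_read_fasta(filename):
    """Return the DNA sequence stored in a FASTA file as a single string."""
    lines = Path(filename).read_text().splitlines()
    # The first line is the header: skip it and join the remaining lines
    return "".join(line.strip() for line in lines[1:])


# Made-up FASTA content, only for this demo (it is not a real Ensembl gene)
SAMPLE = ">demo header\nATAGACC\nAAACATG\n"

with tempfile.TemporaryDirectory() as tmp:
    FOLDER = tmp + "/"     # in Ex2.py: "../Session-04/"
    FILENAME = "demo.txt"  # in Ex2.py: "U5.txt"
    Path(FOLDER + FILENAME).write_text(SAMPLE)
    seq = seq_read_fasta(FOLDER + FILENAME)

print("DNA file:", FILENAME)
print("The first 20 bases are:")
print(seq[:20])  # the demo sequence has only 14 bases, so all are printed
```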
Implement the seq_len(seq) function, which calculates the total number of bases in the sequence. It should be written in the Seq0.py file
- Filename: P0/Ex3.py
- Description: Write a python program for calculating the total length of the 5 Genes: U5, ADA, FRAT1, FXN and U5. The program should call the seq_len() function
- Output: This is what should be seen on the console after the execution:
-----| Exercise 3 |------
Gene U5 ---> Length: 1314
Gene ADA ---> Length: 33912
Gene FRAT1 ---> Length: 3845
Gene FXN ---> Length: 25615
Gene U5 ---> Length: 1314
Process finished with exit code 0
- Considerations:
  - Create a list with the names of the Genes
  - The filename can be obtained by adding the ".txt" string to the gene name
  - Use a for loop for iterating over the 5 genes and calculating their lengths
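A minimal sketch of these considerations is shown below. It only builds the filename of every gene and tests seq_len() on a made-up string, because the real DNA files are not available here. In Ex3.py each filename would be passed to seq_read_fasta() and the result to seq_len(). The FOLDER constant is assumed to be the same as in Exercise 2.

```python
def seq_len(seq):
    """Return the total number of bases in the sequence."""
    return len(seq)


FOLDER = "../Session-04/"  # assumed: same folder as in Exercise 2
GENES = ["U5", "ADA", "FRAT1", "FXN", "U5"]

# The filename is the gene name plus the ".txt" extension
filenames = [FOLDER + gene + ".txt" for gene in GENES]

print("-----| Exercise 3 |------")
for gene, filename in zip(GENES, filenames):
    # In Ex3.py: seq = seq_read_fasta(filename), then print seq_len(seq)
    print(f"Gene {gene} ---> File: {filename}")

print(seq_len("ATTCG"))  # 5 (made-up sequence)
```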
The session is finished. Make sure, during this week, that everything in this list is checked!
- You have all the items of Session 4 checked!
- Your working repo contains the P0 Folder with the following files:
- TODO
- All the previous files have been pushed to your remote Github repo
- Juan González-Gómez (Obijuan)

- Alvaro del Castillo. He designed and created the original content of this subject. Thanks a lot :-)
