Commit 5cc528e

Link to Markdown pages by default, and correct to HTML in the HTML pages.
This way, the links are easy to follow when browsing HTML locally or Markdown on GitHub.
1 parent 5715ae5 commit 5cc528e

9 files changed: +47 -6 lines changed

CAGE_differential_analysis1/Makefile

+1
@@ -1,5 +1,6 @@
 all: analysis.Rmd
 	/usr/bin/Rscript -e "knitr::knit2html('analysis.Rmd')"
+	sed -i 's/\.md/\.html/g' analysis.html
 
 clean:
 	$(RM) -r cache *.fastq *.bam *.fq *.fastq.bz2 *.id *.log tagdust.fa
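
Note on the substitution: `s/\.md/\.html/g` replaces every `.md` in the generated page, including any occurrence in body text or in absolute URLs. A narrower variant is sketched below, assuming relative links always appear as double-quoted `href` attributes. The function name `md_links_to_html` is hypothetical and not part of this commit, and `-E` (extended regular expressions) is assumed to be supported by the installed `sed`.

```shell
# Hypothetical helper, not part of this commit.
# Reads HTML on standard input and rewrites only relative link targets
# that end in ".md" (optionally followed by a "#fragment") to ".html".
# Targets containing ":" (for example http://...) are left untouched.
md_links_to_html() {
    sed -E 's/(href="[^":]*)\.md(["#])/\1.html\2/g'
}

# Example: the relative link changes, the plain text and the absolute URL do not.
printf '%s\n' '<a href="../README.md">README</a> notes.md <a href="http://example.org/x.md">x</a>' \
    | md_links_to_html
```

The same expression could be passed to `sed -i -E` to edit `analysis.html` in place, as the recipe above does with the broader pattern.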

CAGE_differential_analysis2/Makefile

+1
@@ -1,5 +1,6 @@
 all: analysis.Rmd
 	/usr/bin/Rscript -e "knitr::knit2html('analysis.Rmd')"
+	sed -i 's/\.md/\.html/g' analysis.html
 
 clean:
 	$(RM) -r cache *.fastq *.bam *.fq *.fastq.bz2 *.id *.log tagdust.fa

CAGE_differential_analysis2/analysis.html

+2 -2
@@ -189,7 +189,7 @@ <h1>Digital expression comparison between nanoCAGE libraries testing variant tem
 made with different template-switching oligonucleotides. It is intended as an
 example on how to compare shallow-sequenced nanoCAGE libraries.</p>
 
-<p>See the main <a href="../README.md">README</a> for general recommendations on how or what
+<p>See the main <a href="../README.html">README</a> for general recommendations on how or what
 to prepare before running this tutorial.</p>
 
 <h2>Table of contents</h2>
@@ -306,7 +306,7 @@ <h2><a name="artifact-cleaning-and-alignment">Artifact cleaning and alignment of
 <h3>Removal of artifacts with TagDust</h3>
 
 <p>Download <a href="http://bioinformatics.oxfordjournals.org/content/25/21/2839">TagDust</a>
-and install it in the user&#39;s path. See the main <a href="../README.md">README</a> for details.</p>
+and install it in the user&#39;s path. See the main <a href="../README.html">README</a> for details.</p>
 
 <pre><code class="bash">cat &gt; tagdust.fa &lt;&lt;__TagDust__
 &gt;TS (before barcode)

CAGE_normalisation_by_subsampling/Makefile

+1
@@ -1,5 +1,6 @@
 all: subsampling.Rmd
 	/usr/bin/Rscript -e "knitr::knit2html('subsampling.Rmd')"
+	sed -i 's/\.md/\.html/g' subsampling.html
 
 clean:
 	$(RM) -r cache

CAGE_normalisation_by_subsampling/subsampling.Rmd

+1 -1
@@ -66,7 +66,7 @@ The name of the libraries are long because they contain a plain English
 description of their contents. We will shorten them to their identifier. For
 example, `counts.Adipocyte%20-%20breast%2c%20donor1.CNhs11051.11376-118A8`
 becomes `CNhs11051`. The association can be re-made using [FANTOM5 SDRF
-files](../FANTOM5_SDRF_files/).
+files](../FANTOM5_SDRF_files/sdrf.md).
 
 ```{r column_names, dependson="load_data"}
 colnames(osc) <- regmatches(colnames(osc), regexpr('CNhs.....', colnames(osc)))

CAGE_normalisation_by_subsampling/subsampling.html

+2 -2
@@ -206,7 +206,7 @@ <h1>Normalisation of CAGE data by sub-sampling</h1>
 related to the analysis presented in the supplementary note number 4 of the
 FANTOM 5 paper (<a href="http://dx.doi.org/10.1038/nature13182" title="Forrest et al., 2014">Forrest <em>et al.</em>, 2014</a>)</p>
 
-<p>See the main <a href="../README.md">README</a> for general recommendations on how or what
+<p>See the main <a href="../README.html">README</a> for general recommendations on how or what
 to prepare before running this tutorial.</p>
 
 <p>Busy people familiar with CAGE and R can skip the tutorial and read the manual
@@ -245,7 +245,7 @@ <h3>Data loading and preparation in R</h3>
 <p>The name of the libraries are long because they contain a plain English
 description of their contents. We will shorten them to their identifier. For
 example, <code>counts.Adipocyte%20-%20breast%2c%20donor1.CNhs11051.11376-118A8</code>
-becomes <code>CNhs11051</code>. The association can be re-made using <a href="../FANTOM5_SDRF_files/">FANTOM5 SDRF
+becomes <code>CNhs11051</code>. The association can be re-made using <a href="../FANTOM5_SDRF_files/sdrf.html">FANTOM5 SDRF
 files</a>.</p>
 
 <pre><code class="r">colnames(osc) &lt;- regmatches(colnames(osc), regexpr(&#39;CNhs.....&#39;, colnames(osc)))

CAGE_normalisation_by_subsampling/subsampling.md

+1 -1
@@ -73,7 +73,7 @@ The name of the libraries are long because they contain a plain English
 description of their contents. We will shorten them to their identifier. For
 example, `counts.Adipocyte%20-%20breast%2c%20donor1.CNhs11051.11376-118A8`
 becomes `CNhs11051`. The association can be re-made using [FANTOM5 SDRF
-files](../FANTOM5_SDRF_files/).
+files](../FANTOM5_SDRF_files/sdrf.md).
 
 
 ```r

Makefile

+3
@@ -0,0 +1,3 @@
+all: README.md
+	pandoc README.md > README.html
+	sed -i 's/\.md/\.html/g' README.html

README.html

+35
@@ -0,0 +1,35 @@
+<h1 id="tutorials-for-analysing-cage-and-deep-race-data.">Tutorials for analysing CAGE and Deep-RACE data.</h1>
+<p>Various tutorials on how to analyse <a href="https://en.wikipedia.org/wiki/Cap_analysis_gene_expression">CAGE</a> data.</p>
+<ul>
+<li><a href="./Deep-RACE1/Deep-RACE1.html">Deep-RACE</a> (work in progress)</li>
+<li><a href="./CAGE_differential_analysis1/analysis.html">CAGE differential analysis 1</a> (work in progress)</li>
+<li><a href="./CAGE_differential_analysis2/analysis.html">CAGE differential analysis 2</a></li>
+<li><a href="./FANTOM5_SDRF_files/sdrf.html">Simple use of FANTOM5 SDRF files</a></li>
+</ul>
+<p>These tutorials are designed to be executed on a Linux system's command line interface (also called <em>Terminal</em> or <em>shell</em>). I recommend the book <em><a href="http://linuxcommand.org/tlcl.php" title="A Complete Introduction">The Linux Command Line</a></em>, by William E. Shotts, Jr, January 2012, <a href="http://nostarch.com/tlcl.htm" title="the finest in geek entertainment">no starch press</a> to people not familiar with entering commands on the keyboard.</p>
+<p>The programs used are assumed to be installed in advance. On the <a href="http://www.debian.org">Debian</a> operating system, many of them (BWA, SAMtools, BEDTools, ...) are available pre-packaged and will be installed (altogether with many other programs) by the command <code>apt-get install med-bio</code>.</p>
+<p>Other software have to be downloaded and installed by hand. Place them in the <code>bin</code> directory in your home directory, and set their executable property in order to use them. If you had to create the <code>bin</code> directory, it will only be taken into account at your next connection (see <a href="http://stackoverflow.com/questions/16366986/adding-bin-directory-in-your-path">stackoverflow</a> for alternatives).</p>
+<p>Here is for example how to download, compile and install the <a href="http://genome.gsc.riken.jp/osc/english/software/src/tagdust.tgz">tagdust</a> software. By convention, we will download the software in a directory called <code>src</code>. <em>Compiling</em> means to produce the executable program suitable for your computer, using the <a href="https://en.wikipedia.org/wiki/Source_code">source code</a> that was downloaded. On Debian systems, the programs necessary for compiling a program made in the C programming language can be installed through the <code>build-essential</code> package.</p>
+<pre><code>cd # move back to the home directory
+mkdir -p src # create the src directory if it did not exist.
+cd src # enter the src directory
+wget http://genome.gsc.riken.jp/osc/english/software/src/tagdust.tgz # download TagDust
+tar xvf tagdust.tgz # unpack TagDust
+cd tagdust # enter the freshly tagdust directory created by TagDust
+make # compile the program
+cp tagdust ~/bin # copy tagdust to the &#39;bin&#39; directory in your home directory</code></pre>
+<h2 id="frequent-problems">Frequent problems</h2>
+<h3 id="command-not-found.">Command not found.</h3>
+<p>It is not enough to compile a program. The command-line interface needs to find them, and by default it does not search in the current work directory.</p>
+<p>A very good explanation is in <em><a href="http://linuxcommand.org/tlcl.php" title="A Complete Introduction">The Linux Command Line</a></em>'s chapter 24, section <em>Script File Location</em>. Here is a brief summary.</p>
+<p>The standard way to make programs accessible is to add them to one of a set of pre-defined directories that are collectively called the <em>PATH</em>. For system-wide installations, the directory is usually <code>/usr/bin</code>. For local installations by a single user, the directory is usually called <code>bin</code>, in the <em>home</em> directory, also accessible via the shortcut <code>~/bin</code>. If it does not exist, it can be created like any other directory, but it may be necessary to log out and in again in order for the system to recognise this directory in the <em>PATH</em>.</p>
+<p>In addition, the program needs to have the executable permissions. These can be given with the <code>chmod</code> command (see <em><a href="http://linuxcommand.org/tlcl.php" title="A Complete Introduction">The Linux Command Line</a></em>'s chapter 24, section <em>Executable Permissions</em>.), or via the file navigator of the desktop graphical interface.</p>
+<p>Lastly, it is possible to run a program that is not in the <em>PATH</em>. For this, just indicate in which directory it is. The current directory is always aliased to <code>.</code>, so to run a program called <code>myscript</code> that is in the current directory, type <code>./myscript</code>. (The comment above about executable permissions still applies).</p>
+<h3 id="what-is-that-sponge">What is that sponge ?</h3>
+<p><code>sponge</code> is a command from the <a href="http://joeyh.name/code/moreutils/">moreutils</a> collection, that I use frequently. On Debian systems, it is easy to install via the <a href="packages.debian.org/moreutils">moreutils</a> package.</p>
+<p>The goal of <code>sponge</code> is to solve the following problem: when one file is read, piped to a command, and the result is redirected to the file itself, the contents are not updated as expected, but the file is deleted. This is because at the very beginning of the command, the file receiving the redirection is transformed in an empty file before its contents are even read. For example, with a file called <code>example.fq</code>:</p>
+<pre><code>cat example.fq | fastx_trimmer -f 11 &gt; example.fq # Deletes the file.
+cat example.fq | fastx_trimmer -f 11 | sponge example.fq # Trims the first 10 nucleotides.</code></pre>
+<p>Without <code>sponge</code>, one would need to create a temporary file (which is actually what <code>sponge</code> does in a more proper way behind the scene).</p>
+<pre><code>cat example.fq | fastx_trimmer -f 11 &gt; example.tmp.fq
+mv example.tmp.fq example.fq</code></pre>
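
Note on the temporary-file approach described at the end of README.html: it can be wrapped in a small shell function for systems where `sponge` is not installed. This is only a sketch. The name `in_place` is hypothetical, and `cut -c 11-` merely stands in for a filter such as `fastx_trimmer -f 11` (on a real FASTQ file `cut` would also trim the header lines, so it is not a substitute for the trimmer).

```shell
# Hypothetical helper, not part of this commit.
# usage: in_place FILE COMMAND [ARGS...]
# Runs COMMAND with FILE as standard input, writes the output to a
# temporary file next to FILE, and replaces FILE only if COMMAND succeeded.
in_place() {
    file=$1
    shift
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if "$@" < "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example: drop the first 10 characters of every line of a scratch file.
scratch=$(mktemp -d)
printf 'ACGTACGTACGTTT\n' > "$scratch/example.txt"
in_place "$scratch/example.txt" cut -c 11-
cat "$scratch/example.txt"
```

Because the file is replaced only after the command has finished reading it, the original contents are not truncated before they are read. One caveat of this sketch is that the replaced file takes the permissions of the temporary file created by `mktemp`, which may be more restrictive than the original's.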
