Implement exercise protein-translation #585

Closed
wants to merge 2 commits into from
11 changes: 10 additions & 1 deletion config.json
@@ -86,6 +86,15 @@
"topics": [
]
},
{
"uuid": "13154065-a5dc-4824-92ed-b2fae41c4dd6",
"slug": "protein-translation",
"core": false,
"unlocked_by": null,
"difficulty": 1,
Member

I was going to ask about difficulty 1 because I didn't think this exercise is difficulty 1, but given that run-length-encoding is also difficulty 1, I guess I can't complain.

"topics": [
Member

Do you have any suggestions about what topics this exercise might touch upon?

We have a list of some common topics in https://github.com/exercism/problem-specifications/blob/master/TOPICS.txt, but you are not restricted to the topics in that list. If there are any new Haskell-specific topics or language features that this exercise uses, those would be a good choice.

]
},
{
"uuid": "197a543c-d9c7-41c3-814c-4b1ece3db568",
"slug": "grains",
@@ -782,7 +791,7 @@
"uuid": "d5997b60-e54c-4caa-beae-8614f0da3fb3",
"slug": "trinary",
"deprecated": true
}
}
Member

Would you run `configlet fmt .` using the latest release of our Configlet tool? It helps us ensure that tracks are configured properly and normalized.

Member

Please remove these spaces, as they have nothing to do with adding the exercise.

],
"foregone": [

79 changes: 79 additions & 0 deletions exercises/protein-translation/README.md
@@ -0,0 +1,79 @@
# Rna Transcription
Member

This is the description for the RNA transcription exercise (https://github.com/exercism/problem-specifications/blob/master/exercises/rna-transcription/description.md). Since this file is in the protein-translation directory, it needs to be the description of the protein translation exercise: https://github.com/exercism/problem-specifications/blob/master/exercises/protein-translation/description.md


Given a DNA strand, return its RNA complement (per RNA transcription).

Both DNA and RNA strands are a sequence of nucleotides.

The four nucleotides found in DNA are adenine (**A**), cytosine (**C**),
guanine (**G**) and thymine (**T**).

The four nucleotides found in RNA are adenine (**A**), cytosine (**C**),
guanine (**G**) and uracil (**U**).

Given a DNA strand, its transcribed RNA strand is formed by replacing
each nucleotide with its complement:

* `G` -> `C`
* `C` -> `G`
* `T` -> `A`
* `A` -> `U`


## Getting Started

For installation and learning resources, refer to the
[exercism help page](http://exercism.io/languages/haskell).

## Running the tests

To run the test suite, execute the following command:

```bash
stack test
```

#### If you get an error message like this...

```
No .cabal file found in directory
```

You are probably running an old stack version and need
to upgrade it.

#### Otherwise, if you get an error message like this...

```
No compiler found, expected minor version match with...
Try running "stack setup" to install the correct GHC...
```

Just do as it says and it will download and install
the correct compiler version:

```bash
stack setup
```

## Running *GHCi*

If you want to play with your solution in GHCi, just run the command:

```bash
stack ghci
```

## Feedback, Issues, Pull Requests

The [exercism/haskell](https://github.com/exercism/haskell) repository on
GitHub is the home for all of the Haskell exercises.

If you have feedback about an exercise, or want to help implementing a new
one, head over there and create an issue. We'll do our best to help you!

## Source

Rosalind [http://rosalind.info/problems/rna](http://rosalind.info/problems/rna)

## Submitting Incomplete Solutions
It's possible to submit an incomplete solution so you can see how others have completed the exercise.
@@ -0,0 +1,18 @@
name: protein-translation

dependencies:
  - base

library:
  exposed-modules: ProteinTranslation
  source-dirs: src
  dependencies:
    - split

tests:
  test:
    main: Tests.hs
    source-dirs: test
    dependencies:
      - protein-translation
      - hspec
@@ -0,0 +1,26 @@
module ProteinTranslation (toProtein) where

import Data.List.Split (chunksOf)

toProtein :: String -> [String]
toProtein strand = takeWhile ("STOP" /=) $ map codonToProtein $ chunksOf 3 strand

codonToProtein :: String -> String
Just a quick question, in case I'm missing something: to me the description is a little misleading. It suggests that the correct solution is just the map operation written in the description, but judging by this code and the test cases, the exercise actually expects the protein names instead of the codons?

codonToProtein "AUG" = "Methionine"
codonToProtein "UUU" = "Phenylalanine"
codonToProtein "UUC" = "Phenylalanine"
codonToProtein "UUA" = "Leucine"
codonToProtein "UUG" = "Leucine"
codonToProtein "UCU" = "Serine"
codonToProtein "UCC" = "Serine"
codonToProtein "UCA" = "Serine"
codonToProtein "UCG" = "Serine"
codonToProtein "UAU" = "Tyrosine"
codonToProtein "UAC" = "Tyrosine"
codonToProtein "UGU" = "Cysteine"
codonToProtein "UGC" = "Cysteine"
codonToProtein "UGG" = "Tryptophan"
codonToProtein "UAA" = "STOP"
codonToProtein "UAG" = "STOP"
codonToProtein "UGA" = "STOP"
codonToProtein _ = error "Invalid codon."
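
On the question raised in the review comment above (codons versus protein names): the test cases expect names, so a solution has two steps, splitting the strand into codons and then mapping each codon to a name. The following is a minimal sketch of those two steps, not the PR's code. The names `codons`, `codonName`, and `proteinNames` are hypothetical, the table is abbreviated to three entries, and returning `Maybe` instead of calling `error` is an assumption made for illustration. STOP handling is left out here.

```haskell
import Data.List (unfoldr)

-- Split a strand into codons of three characters each.
-- Hypothetical base-only stand-in for `chunksOf 3` from Data.List.Split.
codons :: String -> [String]
codons = unfoldr step
  where
    step [] = Nothing
    step xs = Just (splitAt 3 xs)

-- Abbreviated codon table (three entries only, for illustration).
-- Unknown codons yield Nothing rather than a runtime error.
codonName :: String -> Maybe String
codonName "AUG" = Just "Methionine"
codonName "UUU" = Just "Phenylalanine"
codonName "UGG" = Just "Tryptophan"
codonName _     = Nothing

-- Step 1 gives codons, step 2 gives names; any unknown codon fails the whole result.
proteinNames :: String -> Maybe [String]
proteinNames = traverse codonName . codons
```

With these definitions, `codons "AUGUUUUGG"` evaluates to `["AUG","UUU","UGG"]`, while `proteinNames "AUGUUUUGG"` evaluates to `Just ["Methionine","Phenylalanine","Tryptophan"]`, which is the shape the tests check for.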
20 changes: 20 additions & 0 deletions exercises/protein-translation/package.yaml
@@ -0,0 +1,20 @@
name: protein-translation
version: 1.0.0.0
Member

Given that https://github.com/exercism/problem-specifications/tree/master/exercises/protein-translation/canonical-data.json does not exist (that link leads to a 404), the version number in accordance with the versioning policy (#522) is 0.1.0.1 rather than 1.0.0.0.


dependencies:
  - base

library:
  exposed-modules: ProteinTranslation
  source-dirs: src
  dependencies:
  # - foo # List here the packages you
  # - bar # want to use in your solution.

tests:
  test:
    main: Tests.hs
    source-dirs: test
    dependencies:
      - protein-translation
      - hspec
4 changes: 4 additions & 0 deletions exercises/protein-translation/src/ProteinTranslation.hs
@@ -0,0 +1,4 @@
module ProteinTranslation (toProtein) where

toProtein :: String -> [String]
toProtein strand = error "You need to implement this function"
1 change: 1 addition & 0 deletions exercises/protein-translation/stack.yaml
@@ -0,0 +1 @@
resolver: lts-8.12
95 changes: 95 additions & 0 deletions exercises/protein-translation/test/Tests.hs
@@ -0,0 +1,95 @@
{-# LANGUAGE RecordWildCards #-}

import Data.Foldable (for_)
import Test.Hspec (Spec, describe, it, shouldBe)
import Test.Hspec.Runner (configFastFail, defaultConfig, hspecWith)

import ProteinTranslation (toProtein)

main :: IO ()
main = hspecWith defaultConfig {configFastFail = True} specs

specs :: Spec
specs = describe "toProtein" $ for_ cases test
  where
    test Case{..} = it description $ toProtein strand `shouldBe` expected

data Case = Case { description :: String
                 , strand :: String
                 , expected :: [String]
                 }

cases :: [Case]
cases = [ Case { description = "identifies methionine codon"
               , strand = "AUG"
               , expected = ["Methionine"]
               }
        , Case { description = "identifies phenylalanine codon (UUU)"
               , strand = "UUU"
               , expected = ["Phenylalanine"]
               }
        , Case { description = "identifies phenylalanine codon (UUC)"
               , strand = "UUC"
               , expected = ["Phenylalanine"]
               }
        , Case { description = "identifies leucine codon (UUA)"
               , strand = "UUA"
               , expected = ["Leucine"]
               }
        , Case { description = "identifies leucine codon (UUG)"
               , strand = "UUG"
               , expected = ["Leucine"]
               }
        , Case { description = "identifies leucine codon (UUG)"
Member

Is this case any different from the one immediately above? I see two cases named "identifies leucine codon (UUG)".

               , strand = "UUG"
               , expected = ["Leucine"]
               }
        , Case { description = "identifies serine codon (UCU)"
               , strand = "UCU"
               , expected = ["Serine"]
               }
        , Case { description = "identifies serine codon (UCC)"
               , strand = "UCC"
               , expected = ["Serine"]
               }
        , Case { description = "identifies serine codon (UCA)"
               , strand = "UCA"
               , expected = ["Serine"]
               }
        , Case { description = "identifies serine codon (UCG)"
               , strand = "UCG"
               , expected = ["Serine"]
               }
        , Case { description = "identifies tyrosine codon (UAU)"
               , strand = "UAU"
               , expected = ["Tyrosine"]
               }
        , Case { description = "identifies tyrosine codon (UAC)"
               , strand = "UAC"
               , expected = ["Tyrosine"]
               }
        , Case { description = "identifies cysteine codon (UGU)"
               , strand = "UGU"
               , expected = ["Cysteine"]
               }
        , Case { description = "identifies cysteine codon (UGC)"
               , strand = "UGC"
               , expected = ["Cysteine"]
               }
        , Case { description = "identifies tryptophan codon"
               , strand = "UGG"
               , expected = ["Tryptophan"]
               }
        , Case { description = "translate RNA strand into correct protein"
               , strand = "AUGUUUUGG"
               , expected = ["Methionine", "Phenylalanine", "Tryptophan"]
               }
        , Case { description = "stops translation if the STOP codon is present"
               , strand = "AUGUUUUAA"
               , expected = ["Methionine", "Phenylalanine"]
               }
        , Case { description = "stops translation of longest strand"
Member

I don't understand what "longest" means in this context. Can you clarify? I know that it is the longest strand used in the test cases so far, but I don't understand why that must be specified in the description. To me the important part of this test case is that even if there are codons after STOP, they are not included in the output. Shouldn't that be the description?

               , strand = "UGGUGUUAUUAAUGGUUU"
               , expected = ["Tryptophan", "Cysteine", "Tyrosine"]
               }
        ]
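
The last review comment says the point of the final test case is that codons after STOP are left out of the output. A minimal sketch of that behaviour follows, under assumed names: `Codon`, `classify`, and `translate` are hypothetical and not part of the PR, and the table is abbreviated to the codons used in that test case.

```haskell
-- Outcome of looking up one codon: an amino acid name or the STOP signal.
data Codon = Amino String | Stop

-- Abbreviated table covering only the codons used in the example.
classify :: String -> Maybe Codon
classify "UGG" = Just (Amino "Tryptophan")
classify "UGU" = Just (Amino "Cysteine")
classify "UAU" = Just (Amino "Tyrosine")
classify "UUU" = Just (Amino "Phenylalanine")
classify "UAA" = Just Stop
classify _     = Nothing

-- Translate codon by codon. The recursion ends at the first STOP,
-- so anything after it is never inspected or emitted.
translate :: String -> Maybe [String]
translate [] = Just []
translate strand =
  case classify codon of
    Nothing        -> Nothing                    -- unknown or incomplete codon
    Just Stop      -> Just []                    -- drop everything after STOP
    Just (Amino n) -> (n :) <$> translate rest
  where
    (codon, rest) = splitAt 3 strand
```

Here `translate "UGGUGUUAUUAAUGGUUU"` evaluates to `Just ["Tryptophan","Cysteine","Tyrosine"]`: the trailing `UGG` and `UUU` follow `UAA` and are dropped. A description along the lines of "excludes codons after the STOP codon" would state that intent directly.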