Hello everyone!
I’ve been working on this project for several months, and I think I’ve made enough progress to publish it and share it with you in a more official way → TFinder
To give you a quick summary: in biology, our cells contain proteins that control the expression of genes. These are called transcription factors. They bind to DNA in regulatory regions (often at the beginning and end of a gene) and modulate gene expression from there. DNA is made up of 4 nucleotides called A, T, G and C, and depending on how they are arranged, they code for different proteins. Each transcription factor, for its part, recognizes a specific pattern of nucleotides. My software retrieves the regulatory regions of a gene from the gene's name, and then searches those regions for the specific patterns.
First of all, I’m sure something like this already exists. But however hard I looked, I couldn’t find an application or website that really does it the way I want. Of course, you can do a Ctrl+F, but it’s always the same problem: you have to search for every form of the pattern in every possible orientation. There are applications that do it (SerialCloner), but once again, there’s something missing.
When you have an idea, you want to test it fast. Searching for a promoter sequence can be tedious, and database websites aren’t necessarily designed for novices. That’s where my little script comes in. It extracts the promoter/terminator region of the gene you want, and you can choose the distance upstream and downstream. It also knows the direction of the gene and takes the reverse complement when needed.
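To make the direction handling concrete, here is a minimal sketch of the idea. It is not TFinder's actual code: the function names `reverse_complement` and `extract_region` are hypothetical, and it assumes the chromosome sequence is already available as a plain string with a 0-based position for the transcription initiation site.

```python
# Hypothetical helpers for illustration only, not TFinder's implementation.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_region(chrom_seq, tss, strand, upstream, downstream):
    """Extract the region around a transcription initiation site.

    chrom_seq  -- forward-strand sequence (assumed to be already downloaded)
    tss        -- 0-based index of the initiation site on the forward strand
    strand     -- "+" or "-"
    upstream   -- number of bases to keep before the initiation site
    downstream -- number of bases to keep from the initiation site onward

    The result is always read 5'->3' in the direction of the gene, so the
    initiation site sits at index `upstream` (unless the window was clipped
    at the start of the sequence).
    """
    if strand == "+":
        start = max(tss - upstream, 0)
        end = min(tss + downstream, len(chrom_seq))
        return chrom_seq[start:end]
    if strand == "-":
        # On the minus strand, "upstream" means higher forward coordinates.
        start = max(tss - downstream + 1, 0)
        end = min(tss + upstream + 1, len(chrom_seq))
        return reverse_complement(chrom_seq[start:end])
    raise ValueError("strand must be '+' or '-'")


chrom = "AAACCCGGGTTT"
print(extract_region(chrom, 6, "+", 3, 2))
print(extract_region(chrom, 6, "-", 3, 2))
```

In both calls the base at index 3 of the result corresponds to the initiation site, which keeps the coordinates comparable whatever the gene's direction.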
After that, all you have to do is search for your responsive elements. No need for Ctrl+F, the tool does it for you. It accepts IUPAC code and finds all possible forms in all orientations: original, reverse, complement and reverse complement. You can also use the JASPAR_ID of a transcription factor, or generate a Position Weight Matrix (PWM) from multiple sequences for more accuracy. And last but not least, it gives you the coordinates of each responsive element relative to the transcription initiation site.
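For readers curious how an IUPAC search in every orientation can work, here is a minimal sketch using regular expressions. It is an illustration under my own assumptions, not TFinder's actual code: the names `orientations` and `find_elements` are hypothetical, and positions are reported with the initiation site at 0 (some conventions number it +1 instead).

```python
import re

# Standard IUPAC nucleotide codes written as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement of each IUPAC code (for example R = A/G pairs with Y = C/T).
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def orientations(motif):
    """Return the four orientations of an IUPAC motif."""
    motif = motif.upper()
    comp = motif.translate(IUPAC_COMPLEMENT)
    return {
        "original": motif,
        "reverse": motif[::-1],
        "complement": comp,
        "reverse complement": comp[::-1],
    }


def find_elements(region, motif, tss_index):
    """List (position relative to the initiation site, orientation, match).

    tss_index is the index of the initiation site inside `region`.
    A palindromic motif is reported once per orientation that matches.
    """
    region = region.upper()
    hits = []
    for name, variant in orientations(motif).items():
        pattern = "".join(IUPAC[base] for base in variant)
        # The lookahead lets overlapping occurrences be reported too.
        for m in re.finditer(f"(?=({pattern}))", region):
            hits.append((m.start() - tss_index, name, m.group(1)))
    return sorted(hits)


for hit in find_elements("AAGGATTCC", "GGR", tss_index=4):
    print(hit)
```

A PWM-based search would replace the exact pattern match with a score for each window of the sequence, which is why it can be more selective than a single IUPAC string.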
Hope you enjoy it!