Stout Self (lockhot19)
We divided the original dataset into two annotation tasks: Task 1, 70% of the dataset annotated by one worker, and Task 2, 30% of the dataset annotated by seven workers. For Task 2, we also added an extra on-site rater and a domain expert to further assess the quality of the crowdsourcing validation. Here, we describe a detailed pipeline for RE crowdsourcing validation, create a new release of the PGR dataset with partial domain expert revision, and assess the quality of the MTurk platform. We applied the new dataset to two state-of-the-art deep learning systems (BiOnt and BioBERT) and compared its performance with that of the original PGR dataset, as well as with combinations of the two, achieving a 0.3494 increase in average F-measure. The code supporting our work and the new release of the PGR dataset are available at https://github.com/lasigeBioTM/PGR-crowd.

We present RegulomePA, a database that contains biological information on regulatory interactions between transcription factors (TFs), sigma factors (SFs) and target genes in Pseudomonas aeruginosa PAO1. RegulomePA consists of 4827 regulatory interactions between 2831 nodes, which represent the interactions of TFs and SFs with their target genes. Of the predicted totals, RegulomePA covers 27.27% of the TFs, 54.16% of the SFs and 50.8% of all genes. Each entry in the database corresponds to one node in the network and provides comprehensive details about the gene and its regulatory interactions: gene description, nucleotide sequence, genome-strand position and links to other databases, as well as the type of regulation it exerts or is subject to (repression or activation), the associated experimental evidence and references, and topological information. Additionally, RegulomePA provides a way to recover information on the regulatory circuits of the network to which a gene belongs, and makes available the source code for analyzing the topology of any other regulatory network.
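As a rough illustration of the kind of topological information and regulatory circuits described above, the following minimal sketch treats a regulatory network as a signed, directed edge list. It is not the RegulomePA source code. The node names (`tfA`, `tfB`, `sfC`, `gene1`, `gene2`), the edges and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy edges: (regulator, target, effect). NOT taken from RegulomePA.
EDGES = [
    ("tfA", "gene1", "activation"),
    ("tfA", "tfB", "activation"),
    ("tfB", "tfA", "repression"),
    ("tfB", "gene2", "repression"),
    ("sfC", "tfA", "activation"),
    ("tfB", "tfB", "repression"),  # autoregulation (self-loop)
]


def build_network(edges):
    """Return (targets, regulators): regulator -> {target: effect}, target -> {regulators}."""
    targets = defaultdict(dict)
    regulators = defaultdict(set)
    for reg, tgt, effect in edges:
        targets[reg][tgt] = effect
        regulators[tgt].add(reg)
    return targets, regulators


def degrees(edges):
    """Map each node to (out_degree, in_degree)."""
    targets, regulators = build_network(edges)
    nodes = {n for reg, tgt, _ in edges for n in (reg, tgt)}
    return {
        n: (len(targets.get(n, {})), len(regulators.get(n, set())))
        for n in sorted(nodes)
    }


def circuits_through(node, edges, max_len=3):
    """List simple directed cycles that start and end at `node`, using at most `max_len` edges."""
    targets, _ = build_network(edges)
    found = []

    def walk(current, path):
        for nxt in sorted(targets.get(current, {})):
            if nxt == node:
                found.append(path + [node])  # closed a circuit back to the start
            elif nxt not in path and len(path) < max_len:
                walk(nxt, path + [nxt])

    walk(node, [node])
    return found


if __name__ == "__main__":
    for name, (out_deg, in_deg) in degrees(EDGES).items():
        print(f"{name}: regulates {out_deg}, regulated by {in_deg}")
    print(circuits_through("tfB", EDGES))
```

In this toy network, `circuits_through("tfB", EDGES)` finds both the two-node feedback loop between `tfA` and `tfB` and the self-loop on `tfB`. For a network of several thousand interactions, a dedicated graph library would likely be a better choice than this recursive search.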
The database will be updated yearly by our team, with contributions from ourselves and from users, who are provided with an interactive platform where they can add interactions to the regulatory network along with their supporting references. Database URL:

As starch properties can affect end product quality in many ways, rice starch from Thai domesticated cultivars and landraces has been the focus of increasing research interest. Increasing knowledge in this area creates a high demand from the research community for better organized information. The Thai Rice Starch Database (ThRSDB) is an online database containing data extensively curated from original research articles on Thai rice starch composition, molecular structure and functionality. The key aim of the ThRSDB is to make dispersed rice starch information accessible to research and industrial users, among others. Currently, 373 samples from 191 different Thai rice cultivars have been collected from 39 published articles. The ThRSDB includes the search functions necessary for accessing data, together with a user-friendly web interface and interactive visualization tools. We have also demonstrated how the collected data can be used efficiently to observe the relationships between starch parameters and rice cultivars through correlation analysis and Partial Least Squares Discriminant Analysis. Database URL: http://thairicestarch.kku.ac.th

Photosystem II (PSII) is a large membrane protein complex performing primary charge separation in oxygenic photosynthesis. The biogenesis of PSII is a complicated process that involves a coordinated linking of assembly modules in a precise order. Each such module consists of one large chlorophyll (Chl)-binding protein, a number of small membrane polypeptides, pigments and other cofactors. We isolated the CP47 antenna module from the cyanobacterium Synechocystis sp. PCC 6803 and found that it contains an 11-kDa protein encoded by the s