Turkiye Klinikleri Journal of Biostatistics

.: ORIGINAL RESEARCH
Classification of RNA-Sequencing Data Via Poisson and Negative Binomial Linear Discriminant Analyses: A Methodological Study
RNA-Dizileme Verilerinin Poisson ve Negatif Binom Doğrusal Ayırma Analizleri ile Sınıflandırılması: Metodolojik Bir Çalışma
Dinçer GÖKSÜLÜK (a), Ahmet Ergün KARAAĞAOĞLU (b)
(a) Department of Biostatistics, Erciyes University Faculty of Medicine, Kayseri, Türkiye
(b) Department of Biostatistics, Lokman Hekim University Faculty of Medicine, Ankara, Türkiye
Turkiye Klinikleri J Biostat. 2023;15(3):150-60
doi: 10.5336/biostatic.2023-98597
Article Language: EN
ABSTRACT
Objective: Microarray and RNA sequencing (RNA-Seq) technologies are frequently employed in genetic data analysis to detect disease-associated genes, identify cancer subtypes, and enable molecular diagnosis. While numerous methods have been proposed for classification problems using microarray data, few methods have been developed for classifying RNA-Seq data. This study aims to compare the performance of novel methods developed for RNA-Seq data on three distinct real-life datasets. Material and Methods: Cervical cancer, Alzheimer's disease, and kidney cancer RNA-Seq data were used in this study. The data were divided into training and test sets in proportions of 70% and 30%, respectively. Various preprocessing steps, such as normalization, power transformation, and variance filtering, were applied to the data. The Poisson Linear Discriminant Analysis (PLDA) and Negative Binomial Linear Discriminant Analysis (NBLDA) models were used for classification, and their predictive performances were compared. Results: Among the three datasets, the Alzheimer's data exhibited the lowest overdispersion, while the cervical cancer data exhibited the highest. The NBLDA model demonstrated superior classification performance compared to the PLDA model. In cases of mild-to-moderate overdispersion, the predictive performance of the PLDA model improved when power transformation was applied, resulting in performance similar to that of the NBLDA model. Conclusion: PLDA and NBLDA are two novel and promising techniques for classifying RNA-Seq data. The performance of these models is influenced by the degree of overdispersion. When overdispersion is high, the NBLDA model is recommended.

Keywords: Genomics; RNA-Sequencing; PLDA; NBLDA; classification
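
The PLDA step described in the abstract can be illustrated with a minimal sketch. It assumes the unshrunken form of the Poisson LDA rule as commonly formulated (Witten, 2011), with total-count size factors and a simple power transformation of the counts. The function names, the smoothing constant `beta`, and the toy counts are hypothetical and are not taken from the study; the NBLDA model, which additionally uses gene-wise dispersion estimates, is not shown.

```python
import math


def power_transform(counts, alpha):
    """Raise every count to the power alpha (0 < alpha <= 1) to damp overdispersion."""
    return [[x ** alpha for x in row] for row in counts]


def fit_plda(train_x, train_y, beta=1.0):
    """Estimate PLDA parameters from a samples-by-genes count matrix.

    Model assumed here: X_ij | class k ~ Poisson(s_i * g_j * d_kj).
    beta is a hypothetical smoothing constant that keeps d_kj positive.
    """
    n_genes = len(train_x[0])
    total = sum(sum(row) for row in train_x)
    # Total-count size factor per training sample (these sum to 1).
    size = [sum(row) / total for row in train_x]
    # Total count per gene across all training samples.
    gene_total = [sum(row[j] for row in train_x) for j in range(n_genes)]
    d, prior = {}, {}
    for k in sorted(set(train_y)):
        members = [i for i, y in enumerate(train_y) if y == k]
        size_k = sum(size[i] for i in members)
        # Class-specific deviation of each gene from its overall level.
        d[k] = [
            (sum(train_x[i][j] for i in members) + beta)
            / (size_k * gene_total[j] + beta)
            for j in range(n_genes)
        ]
        prior[k] = len(members) / len(train_y)
    return {"total": total, "gene_total": gene_total, "d": d, "prior": prior}


def predict_plda(model, x):
    """Return the class with the largest Poisson log-likelihood score."""
    size_new = sum(x) / model["total"]  # size factor of the new sample
    scores = {}
    for k, d_k in model["d"].items():
        scores[k] = math.log(model["prior"][k]) + sum(
            x_j * math.log(d_kj) - size_new * g_j * d_kj
            for x_j, g_j, d_kj in zip(x, model["gene_total"], d_k)
        )
    return max(scores, key=scores.get)


# Toy data (hypothetical): 4 samples, 2 genes, 2 classes.
train_x = [[30, 5], [28, 7], [6, 25], [4, 31]]
train_y = ["A", "A", "B", "B"]

model = fit_plda(train_x, train_y)
print(predict_plda(model, [20, 3]))
print(predict_plda(model, [3, 18]))

# Same classifier after a square-root power transformation (alpha = 0.5).
model_pt = fit_plda(power_transform(train_x, 0.5), train_y)
print(predict_plda(model_pt, power_transform([[20, 3]], 0.5)[0]))
```

In this sketch the power transformation is applied to both training and test counts before fitting, which is one plausible way to read the preprocessing order; the exponent 0.5 is chosen only for illustration.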
ÖZET
Amaç: Mikrodizi ve RNA dizileme teknolojileri, genetik çalışmalarda hastalıkla ilişkili genlerin tespiti, kanser alt tiplerinin belirlenmesi, moleküler teşhis gibi amaçlar için sıklıkla kullanılan yöntemlerdir. Mikrodizi verilerinde sınıflama problemleri için literatürde birçok yöntem önerilmiştir. Bununla birlikte RNA dizileme verilerinde sınıflama problemleri için sınırlı sayıda yöntem bulunmaktadır. Bu çalışma, RNA dizileme verileri için geliştirilen yeni yöntemlerin performansını 3 farklı gerçek veri seti üzerinde karşılaştırmayı amaçlamaktadır. Gereç ve Yöntemler: Bu çalışmada, serviks kanseri, Alzheimer hastalığı ve böbrek kanseri RNA dizileme verileri kullanılmıştır. Veriler, sırasıyla %70 ve %30 oranında eğitim ve test kümelerine ayrılmıştır. Normalizasyon, güç dönüşümü ve varyans filtreleme gibi çeşitli ön işlemlerden sonra veriler, Poisson Doğrusal Ayırma Analizi (PDAA) ve Negatif Binom Doğrusal Ayırma Analizi (NBDAA) modelleri kullanılarak modellenmiş ve modellerin tahmin performansları karşılaştırılmıştır. Bulgular: Üç veri seti arasında Alzheimer verisi en düşük, serviks kanseri verisi ise en yüksek aşırı dağılıma sahipti. NBDAA modeli, PDAA modeline göre daha iyi sınıflandırma performansı göstermiştir. Hafif-orta derecede aşırı dağılım gözlendiği durumlarda, PDAA modelinin tahmin performansı güç dönüşümü uygulandığında iyileşmiş ve NBDAA ile benzer performans elde edilmiştir. Sonuç: PDAA ve NBDAA modelleri, RNA dizileme verilerinin sınıflandırılmasında kullanılan yeni ve umut verici tekniklerdir. Bu modellerin performansı, veri setindeki aşırı yaygınlığın derecesinden etkilenmektedir. Veride yüksek aşırı yaygınlık olması durumunda NBDAA modelinin kullanılması önerilmektedir.

Anahtar Kelimeler: Genomik; RNA-dizileme; Poisson Doğrusal Ayırma Analizi; Negatif Binom Doğrusal Ayırma Analizi; sınıflama
