ISSN 2071-8594

Russian Academy of Sciences

Editor-in-Chief

Academician S. V. Emelyanov

N.V. Korepanova "Machine learning for treatment optimization in patient subgroups"

Abstract.

Clinical studies show that the effect of a treatment often depends on various patient characteristics: clinical, anthropological, genetic, psychological, social, and so on. Identifying such dependencies is the task of personalized medicine, and it helps create treatment strategies that are better adapted to the individual patient. This paper reviews approaches to analyzing clinical trial data in order to find the characteristics that affect treatment efficacy and to identify patient subgroups in which the efficacy of the experimental and control treatments differs substantially.

Keywords:

personalized medicine, subgroup analysis, clinical trials, machine learning.

Pp. 54-66.

References

1. Brookes S.T., Whitley E., Peters T.J., Mulheran P.A., Egger M., Smith G.D. Subgroup analyses in randomised controlled trials: quantifying the risks of false-positives and false-negatives // Health Technology Assessment, Vol. 5, No. 33, 2001. pp. 1-56.
2. Cook D.I., Gebski V.J., Keech A.C. Subgroup analysis in clinical trials // Medical Journal of Australia, Vol. 180, No. 6, 2004. pp. 289-291.
3. Grouin J.M., Coste M., Lewis J. Subgroup Analyses in Randomized Clinical Trials: Statistical and Regulatory Issues // Journal of Biopharmaceutical Statistics, Vol. 15, No. 5, 2005. pp. 869-882.
4. Kent D.M., Rothwell P.M., Ioannidis J.P.A., Altman D.G., Hayward R.A. Assessing and reporting heterogeneity in treatment effects in clinical trials: a proposal // Trials, Vol. 11, No. 1, 2010. P. 85.
5. Pocock S.J., Assmann S.E., Enos L.E., Kasten L.E. Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems // Statistics in Medicine, Vol. 21, No. 19, 2002. pp. 2917-2930.
6. Rothwell P.M. Subgroup analysis in randomised controlled trials: importance, indications, and interpretation // Lancet, Vol. 365, No. 9454, 2005. pp. 176-186.
7. Sleight P. Debate: Subgroup analyses in clinical trials — fun to look at, but don’t believe them! // Current Controlled Trials in Cardiovascular Medicine, Vol. 1, No. 1, 2000. pp. 25-27.
8. Sun X., Briel M., Walter S.D., Guyatt G.H. Is a subgroup effect believable? Updating criteria to evaluate the credibility of subgroup analyses // British Medical Journal, Vol. 340, 2010. pp. 850-854.
9. Wang R., Lagakos S.W., Ware J.H., Hunter D.J., Drazen J.M. Statistics in Medicine - Reporting of Subgroup Analyses in Clinical Trials // The New England Journal of Medicine, Vol. 357, 2007. pp. 2189-2194.
10. Lipkovich I., Dmitrienko A., D'Agostino R.B. Tutorial in Biostatistics: Data-Driven Subgroup Identification and Analysis in Clinical Trials // Statistics in Medicine, Vol. 36, No. 1, 2016. pp. 136-196.
11. Meinshausen N., Meier L., Bühlmann P. p-Values for High-Dimensional Regression // Journal of the American Statistical Association, Vol. 104, No. 488, 2009. pp. 1671-1681.
12. Lockhart R., Taylor J., Tibshirani R.J., Tibshirani R. A Significance Test for the Lasso // Annals of Statistics, Vol. 42, No. 2, 2014. pp. 413-468.
13. Freidlin B., Simon R. Adaptive Signature Design: An Adaptive Clinical Trial Design for Generating and Prospectively Testing A Gene Expression Signature for Sensitive Patients // Clinical Cancer Research, Vol. 11, No. 21, 2005. pp. 7872-7878.
14. Meinshausen N., Bühlmann P. Stability selection // Journal of the Royal Statistical Society, Series B, Vol. 72, No. 4, 2010. pp. 417-423.
15. Gunter L., Zhu J., Murphy S. Variable Selection for Qualitative Interactions in Personalized Medicine while Controlling The Familywise Error Rate // Journal of Biopharmaceutical Statistics, Vol. 21, No. 6, 2011. pp. 1063-1078.
16. Simon R.M., Subramanian J., Li M.C., Menezes S. Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data // Briefings in Bioinformatics, Vol. 12, No. 3, 2011. pp. 203-214.
17. Good P. Resampling Methods: a Practical Guide to Data Analysis. 3rd ed. Boston: Birkhauser, 2005.
18. Loh W.Y., Shih Y.S. Split Selection Methods for Classification Trees // Statistica Sinica, Vol. 7, 1997. pp. 815-840.
19. Hothorn T., Hornik K., Zeileis A. Unbiased Recursive Partitioning: A Conditional Inference Framework // Journal of Computational and Graphical Statistics, Vol. 15, No. 3, 2006. pp. 651-674.
20. Hesterberg T., Moore D.S., Monaghan S., Clipson A., R.E. Bootstrap methods and permutation tests. Vol 5. // In: The Practice of Business Statistics / Ed. by Moore D.S. W.H. Freeman, 2005. pp. 1-70.
21. Varma S., Simon R. Bias in error estimation when using cross-validation for model selection // BMC Bioinformatics, Vol. 7, 2006. P. 91.
22. Foster J.C., Taylor J., Ruberg S.J. Subgroup identification from randomized clinical trial data // Statistics in Medicine, Vol. 30, No. 24, 2011. pp. 2867-2880.
23. Dixon D.O., Simon R. Bayesian Subset Analysis // Biometrics, Vol. 47, No. 3, 1991. pp. 871-881.
24. Berger J.O., Wang X., Shen L. A Bayesian Approach to Subgroup Identification // Journal of Biopharmaceutical Statistics, Vol. 24, No. 1, 2014. pp. 110-129.
25. Xu Y., Trippa L., Müller P., Ji Y. Subgroup-Based Adaptive (SUBA) Designs for Multi-Arm Biomarker Trials // Statistics in Biosciences, Vol. 8, No. 1, 2016. pp. 159-180.
26. Xu Y., Yu M., Zhao Y.Q., Li Q., Wang S., Shao J. Regularized Outcome Weighted Subgroup Identification for Differential Treatment Effects // Biometrics, Vol. 71, No. 3, Sep 2015. pp. 645-653.
27. Little R.J., Rubin D.B. Causal effects in clinical and epidemiological studies via potential outcomes // Annual Review of Public Health, Vol. 21, 2000. pp. 121-145.
28. Cox D.R. Regression Models and Life-Tables // Journal of the Royal Statistical Society. Series B, Vol. 34, No. 2, 1972. pp. 187-220.
29. Breiman L., Friedman J.H., Olshen R.A., Stone C.J. Classification and Regression Trees. Wadsworth: Belmont, CA, 1984.
30. Royston P., Altman D.G. Regression Using Fractional Polynomials of Continuous Covariates: Parsimonious Parametric Modelling // Journal of the Royal Statistical Society. Series C, Vol. 43, No. 3, 1994. pp. 429-467.
31. Royston P., Sauerbrei W. A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials // Statistics in Medicine, Vol. 23, No. 16, 2004. pp. 2509-2525.
32. Tibshirani R. Regression shrinkage and selection via the lasso // Journal of the Royal Statistical Society. Series B, Vol. 58, No. 1, 1996. pp. 267-288.
33. Imai K., Ratkovic M. Estimating Treatment Effect Heterogeneity in Randomized Program Evaluation // The Annals of Applied Statistics, Vol. 7, No. 1, 2013. pp. 443-470.
34. Cai T., Tian L., Wong P.H., Wei L.J. Analysis of randomized comparative clinical trial data for personalized treatment selections // Biostatistics, Vol. 12, No. 2, 2011. pp. 270-282.
35. Song X., Pepe M.S. Evaluating Markers for Selecting a Patient's Treatment // Biometrics, Vol. 60, No. 4, 2004. pp. 874-883.
36. Huang Y., Gilbert P.B., Janes H. Assessing Treatment-Selection Markers using a Potential Outcomes Framework // Biometrics, Vol. 68, No. 3, 2012. pp. 687-696.
37. Zhao L., Tian L., Cai T., Claggett B., Wei L.J. Effectively Selecting a Target Population for a Future Comparative Study // Journal of the American Statistical Association, Vol. 108, No. 502, 2013. pp. 527-539.
38. Breiman L. Random forests // Machine Learning, Vol. 45, No. 1, 2001. pp. 5-32.
39. Dusseldorp E., Conversano C., Van Os B.J. Combining an Additive and Tree-Based Regression Model Simultaneously: STIMA // Journal of Computational and Graphical Statistics, Vol. 19, No. 3, 2010. pp. 514-530.
40. Hodges J.S., Cui Y., Sargent D.J., Carlin B.P. Smoothing Balanced Single-Error-Term Analysis of Variance // Technometrics, Vol. 49, No. 1, 2007. pp. 12-25.
41. Gu X., Yin C., Lee J.J. Bayesian Two-step Lasso Strategy for Biomarker Selection in Personalized Medicine Development for Time-to-Event Endpoints // Contemporary Clinical Trials, Vol. 36, No. 2, 2013. pp. 642-650.
42. Negassa A., Ciampi A., Abrahamowicz M., Shapiro S., Boivin J.F. Tree-structured subgroup analysis for censored survival data: Validation of computationally inexpensive model selection criteria // Statistics and Computing, Vol. 15, No. 3, 2005. pp. 231-239.
43. Su X., Tsai C.L., Wang H., Nickerson D.M., Li B. Subgroup Analysis via Recursive Partitioning // Journal of Machine Learning Research, Vol. 10, 2009. pp. 141-158.
44. Su X., Zhou T., Yan X. Interaction Trees with Censored Survival Data // The International Journal of Biostatistics, Vol. 4, No. 1, 2008. P. 2.
45. Loh W.Y., He X., Man M. A regression tree approach to identifying subgroups with differential treatment effects // Statistics in Medicine, Vol. 34, No. 11, 2015. pp. 1818-1833.
46. Loh W.Y., Fu H., Man M., Champion V., Yu M. Identification of subgroups with differential treatment effects for longitudinal and multiresponse variables // Statistics in Medicine, Vol. 35, No. 26, 2016. pp. 4837-4855.
47. Loh W.Y. Regression Trees with Unbiased Variable Selection and Interaction Detection // Statistica Sinica, Vol. 12, 2002. pp. 361-386.
48. Zeileis A., Hothorn T., Hornik K. Model-based Recursive Partitioning // Journal of Computational and Graphical Statistics, Vol. 17, No. 2, 2008. pp. 492-514.
49. Dusseldorp E., Mechelen I.V. Qualitative Interaction Trees: a tool to identify qualitative treatment-subgroup interactions // Statistics in Medicine, Vol. 33, No. 2, 2014. pp. 219-237.
50. Tian L., Alizadeh A.A., Gentles A.J., Tibshirani R. A Simple Method for Estimating Interactions between a Treatment and a Large Number of Covariates // Journal of the American Statistical Association, Vol. 109, No. 508, 2014. pp. 1517-1532.
51. Jones H.E., Ohlssen D.I., Neuenschwander B., Racine A., Branson M. Bayesian Models for Subgroup Analysis in Clinical Trials // Clinical Trials, Vol. 8, No. 2, 2011. pp. 129-143.
52. Qian M., Murphy S.A. Performance guarantees for individualized treatment rules // The Annals of Statistics, Vol. 39, No. 2, 2011. pp. 1180-1210.
53. Zhao Y., Zeng D., Rush A.J., Kosorok M.R. Estimating individualized treatment rules using outcome weighted learning // Journal of the American Statistical Association, Vol. 107, No. 449, 2012. pp. 1106-1118.
54. Lu W., Zhang H.H., Zeng D. Variable Selection for Optimal Treatment Decision // Statistical Methods in Medical Research, Vol. 22, No. 5, 2013. pp. 493-504.
55. Foster J., Taylor J.M.G., Kaciroti N., Nan B. Simple subgroup approximations to optimal treatment regimes from randomized clinical trial data // Biostatistics, Vol. 16, No. 2, 2015. pp. 368-382.
56. Zhang B., Tsiatis A.A., Davidian M., Zhang M., Laber E. Estimating Optimal Treatment Regimes from a Classification Perspective // Stat, Vol. 1, No. 1, 2012. pp. 103-114.
57. Zhang B., Tsiatis A.A., Laber E.B., Davidian M. Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions // Biometrika, Vol. 100, No. 3, 2013. pp. 681-694.
58. Laber E.B., Zhao Y.Q. Tree-based methods for individualized treatment regimes // Biometrika, Vol. 102, No. 3, 2015. pp. 501-514.
59. Zhang Y., Laber E.B., Tsiatis A., Davidian M. Using decision lists to construct interpretable and parsimonious treatment regimes // Biometrics, Vol. 71, No. 4, 2015. pp. 895-904.
60. Fu H., Zhou J., Faries D.E. Estimating optimal treatment regimes via subgroup identification in randomized control trials and observational studies // Statistics in Medicine, Vol. 35, No. 19, 2016. pp. 3285-3302.
61. Lo V.S.Y. The true lift model: a novel data mining approach to response modeling in database marketing // SIGKDD Explorations, Vol. 4, No. 2, 2002. pp. 78-86.
62. Larsen K. Net lift models: optimizing the impact of your marketing // Predictive Analytics World, 2011. Workshop presentation.
63. Robins J.M. Correcting for non-compliance in randomized trials using structural nested mean models // Communications in Statistics - Theory and Methods, Vol. 23, No. 8, 1994. pp. 2379-2412.
64. Robins J., Rotnitzky A. Estimation of treatment effects in randomised trials with non-compliance and a dichotomous outcome using structural mean models // Biometrika, Vol. 91, No. 4, 2004. pp. 763-783.
65. Jaskowski M., Jaroszewicz S. Uplift modeling for clinical trial data // ICML, 2012 workshop on machine learning for clinical data analysis. Edinburgh. Scotland. 2012.
66. Radcliffe N.J., Surry P.D. Differential analysis: modeling true response by isolating the effect of a single action // Proceedings of Credit Scoring and Credit Control VI. 1999.
67. Radcliffe N.J., Surry P.D. Real-world uplift modeling with significance-based uplift trees. Portrait Technical Report TR-2011-1, Stochastic Solutions, 2011.
68. Hansotia B., Rukstales B. Incremental Value Modeling // Journal of Interactive Marketing, Vol. 16, No. 3, 2002. pp. 35-46.
69. Chickering D.M., Heckerman D. A decision theoretic approach to targeted advertising // Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI'00). 2000. pp. 82-88.
70. Rzepakowski P., Jaroszewicz S. Decision trees for uplift modeling // Proceedings of the 10th IEEE International conference on data mining (ICDM). Sydney. Australia. 2010. pp. 441-450.
71. Rzepakowski P., Jaroszewicz S. Decision trees for uplift modeling with single and multiple treatments // Knowledge and Information Systems, Vol. 32, No. 2, 2012. pp. 303-327.
72. Kuusisto F., Costa V.S., Nassif H., Burnside E., Page D., Shavlik J. Support vector machines for differential prediction // Proceedings of the ECML-PKDD. 2014.
73. Jaroszewicz S., Zaniewicz Ł. Székely regularization for uplift modeling // In: Challenges in computational statistics and data mining. Springer International Publishing, 2016. pp. 135-154.
74. Zaniewicz L., Jaroszewicz S. Lp - Support vector machines for uplift modeling // Knowledge and Information Systems, Vol. 53, No. 1, 2017. pp. 269-296.
75. Guelman L., Guillen M., Perez-Marin A.M. Random forests for uplift modeling: an insurance customer retention case. Vol 115. // In: Modeling and simulation in engineering, economics and management. Springer, Berlin, 2012. pp. 123-133.
76. Soltys M., Jaroszewicz S., Rzepakowski P. Ensemble methods for uplift modeling // Data Mining and Knowledge Discovery, Vol. 29, No. 6, 2015. pp. 1531-1559.
77. Chen G., Zhong H., Belousov A., Devanarayan V. A PRIM approach to predictive-signature development for patient stratification // Statistics in Medicine, Vol. 34, No. 2, 2015. pp. 317-342.
78. Lipkovich I., Dmitrienko A., Denne J., Enas G. Subgroup identification based on differential effect search—A recursive partitioning method for establishing response to treatment in patient subpopulations // Statistics in Medicine, Vol. 30, No. 21, 2011. pp. 2601-2621.
79. Friedman J.H., Fisher N.I. Bump Hunting in High-Dimensional Data // Statistics and Computing, Vol. 9, No. 2, 1999. pp. 123-243.
80. Sivaganesan S., Laud P.W., Müller P. A Bayesian subgroup analysis with a zero-enriched Polya Urn scheme // Statistics in Medicine, Vol. 30, No. 4, 2010. pp. 312-323.
81. Korepanova N., Kuznetsov S.O., Karachunskiy A.I. Matchings and Decision Trees for Determining Optimal Therapy // In: Analysis of Images, Social Networks and Texts Third International Conference, AIST 2014, Yekaterinburg, Russia, April 10-12, 2014, Revised Selected Papers. Springer International Publishing, 2014. pp. 101-110.
82. Gale D., Shapley L.S. College Admissions and the Stability of Marriage // The American Mathematical Monthly, Vol. 69, No. 1, 1962. pp. 9-15.
83. Roth A.E. Deferred acceptance algorithms: history, theory, practice, and open questions, Harvard University, 2007.
84. Alkan A., Gale D. Stable schedule matching under revealed preference // Journal of Economic Theory, Vol. 112, 2003. pp. 289-306.
85. Ganter B., Kuznetsov S.O. Pattern Structures and Their Projections // 9th International Conference on Conceptual Structures (ICCS 2001). 2001. Vol. 2120. pp. 129-142.
86. Kuznetsov S.O. Pattern Structures for Analyzing Complex Data // 12th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing (RSFDGrC 2009). 2009. Vol. 5908. pp. 33-44.
87. Korepanova N., Kuznetsov S.O. Pattern Structures for Treatment Optimization // In: CLA 2016: Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications. CEUR Workshop Proceedings. Moscow: Higher School of Economics, National Research University, 2016. pp. 217-228.
88. Korepanova N.V., Kuznetsov S.O. Choice of therapy for an oncological disease in patient subgroups based on the analysis of closed descriptions // Fifteenth National Conference on Artificial Intelligence with International Participation KII-2016 (October 3-7, 2016, Smolensk, Russia). Conference proceedings. In 3 volumes. Smolensk, Russia, 2016. Vol. 1. pp. 352-359. (In Russian.)