Significance: Automated understanding of human embryonic stem cell (hESC) videos is essential for the quantified analysis and classification of various states of hESCs and their health for diverse applications in regenerative medicine.

Aim: This paper aims to develop an ensemble method and bagging of deep learning classifiers as a model for hESC classification on a video dataset collected using a phase contrast microscope.

Approach: The paper describes a deep learning-based random network (RandNet) with an autoencoded feature extractor for the classification of hESCs into six different classes, namely, (1) cell clusters, (2) debris, (3) unattached cells, (4) attached cells, (5) dynamically blebbing cells, and (6) apoptotically blebbing cells. The approach uses unlabeled data to pre-train the autoencoder network and fine-tunes it using the available annotated data.

Results: The proposed approach achieves a classification accuracy of 97.23 ± 0.94% and outperforms the state-of-the-art methods. Additionally, the approach has a very low training cost compared with the other deep-learning-based approaches, and it can be used as a tool for annotating new videos, saving enormous hours of manual labor.

Conclusions: RandNet is an efficient and effective method that uses a combination of subnetworks trained using both labeled and unlabeled data to classify hESC images.

1. Introduction

Human embryonic stem cells (hESCs) are derived from the inner cell mass of developing blastocysts and possess two important properties: (1) self-renewal and (2) pluripotency.1–3 Self-renewal is the ability to go through unlimited cycles of cell division, and pluripotency is the capability to differentiate into any cell type in the human body. hESCs are an important resource for regenerative medicine, basic research on human prenatal development, and toxicological testing of drugs and environmental chemicals. Under their state of pluripotency, they can also be maintained indefinitely.4,5

hESC classification is an important task for toxicity studies. Through classification of hESCs in time-lapsed videos, biologists can analyze apoptotic behaviors in both cell clusters and individual cells under certain test chemicals. Therefore, understanding the behavior of hESCs is fundamental for medicinal and toxicological research.5–8 The classification of hESCs in video is essential for quantifiable analysis of hESC processes and behavior.9 However, manual analysis of stem cells is laborious, tedious, and often inaccurate due to three main human limitations. First, the accuracy of a human performing classification decreases as working hours grow longer. Second, uncertainty in classification arises from the wide variety of objects that appear within a class. Third, the amount of time spent working on datasets can lead to confusion in assigning hESCs to the right classes.

Figure 1 shows a modularized system overview for an automated segmentation and classification process. In this paper, we focus on the classification of the detected components from hESC videos; the detected components are the six general classes shown in Fig. 1. Guan et al.3 provide details of a method for the fast detection and segmentation of individual video components.

Because phase contrast imaging is a non-invasive microscopy technique, it is widely used to study the behavior of live hESCs in video.10 In this study, the hESC videos were taken with a BioStation IM.11 The BioStation has an incubator with time-lapsed video capability. Each video captures an assay. The BioStation IM enables video capture of living cells under a stable and optimal environment. More details about BioStation IM and the images can be found in Talbot et al.7 The hESC videos consist of frames of phase contrast images. Each frame can contain any of the following six general components: (1) cell clusters, (2) debris, (3) unattached cells, (4) attached cells, (5) dynamically blebbing cells, and (6) apoptotically blebbing cells. Among these, unattached, attached, dynamically blebbing, and apoptotically blebbing cells are the four classes that are of significant interest in experimental work. These four classes are regarded as the four intrinsic cell types in a video. Figure 2 shows examples of the six classes.

Conceptually, the six classes of hESCs can be distinguished with three fundamental human perceptual capabilities for identification and classification of objects: (1) shape, (2) intensity, and (3) texture. Each class can be uniquely identified by one or a combination of these perceptual cues. For instance, the apoptotically blebbing cells in Fig. 2(f) are similar in intensity, shape, and texture among themselves. hESCs in Figs. 2(e) and 2(f) are dissimilar in intensity, but they are similar in shape and texture. The debris in Fig. 2(b) has intensity values similar to those of various classes shown in Fig. 2. Traditionally, a feature vector can be derived from these perceptual cues. However, with the advent of deep learning techniques, we can develop classification models with the given abundance of labeled data.
Therefore, manually generating a feature vector for a classification system is suitable only when data are quite limited.

Fig. 2 Six classes of hESCs from phase contrast images detected using the approach proposed by Guan et al.3: (a) cell clusters; (b) debris; (c) unattached cells; (d) attached cells; (e) dynamically blebbing cells; (f) apoptotically blebbing cells. Note that the cells go through multiple states during the data collection (at every few minutes), which could last for 48 to 100 h.

Considering that unlabeled data are often far more abundant than labeled data, we propose a random network (RandNet) with an autoencoded feature extractor. The proposed method focuses on building random subnetworks with the feature extractor derived from unlabeled data. Moreover, the proposed method incorporates ensemble methodology in the network to reduce overfitting.

1.1. Related Work

To develop a practical system with high classification accuracy, a modularized structure is often preferred over a deep learning approach that simultaneously performs detection and segmentation because modularized components allow for flexibility and adaptability, as shown in Fig. 3 and Refs. 12–14. We consider segmentation and classification to be two separate modularized components or subsystems. Additionally, direct classification from the input videos is extremely challenging because these are dynamic images evolving over time. In this paper, we focus on the classification component.

There has been very limited work on building an automated classification system for stem cells in video with both labeled and unlabeled datasets.8 Niioka et al.15 used a convolutional neural network (CNN) to study cellular differentiation from myoblasts to myotubes. Their classification model was built upon the concept that cellular morphology changes during differentiation, and this feature was easily captured in stained fluorescent images.
In addition, Xie et al.16 worked on fluorescent images with a CNN for cell counting. Although their experiment was successful, their classification problem was simple since their images contained only circular dots. Chang et al.17 also used a CNN for human induced pluripotent stem cell region classification. Their study focused on classifying cell cluster patterns. The datasets used in the works by Niioka et al.,15 Xie et al.,16 and Chang et al.17 came from experiments that use staining techniques; staining is a very intrusive technique for enhancing contrast in cells. In contrast, our hESC experiments were done without staining.

Similar work on stem cell classification with phase contrast images was proposed by Theagarajan et al.18,19 They suggested using a generative method to train the network and classify real data. However, they did not consider realistic unlabeled data, which can be efficiently generated for training; typical generative methods have huge computational cost for synthetic dataset generation as well as training with a large set of synthetic data. Therefore, this paper proposes using the unlabeled data (without the use of generative methods) for model training and fine-tuning the model with labeled data.

1.2. Contributions of this Paper

In this paper, we focus on the classification component. From Fig. 2, we can infer that there are four major challenges in hESC classification. First, when attached cells spread thin in the substrate, the cells are fused with the background. Second, dynamically blebbing cells and apoptotically blebbing cells are similar in intensity. Third, when a large attached cell goes through the apoptotic process, it appears as a cell cluster of apoptotically blebbing cells. Fourth, image data are obtained under both 10× and objectives, which adds challenges in discerning individual blebbing cells from cell clusters. In light of the state of the art, the contributions of this paper are as follows.
Section 2 presents the materials and methods in detail. Section 3 provides experimental results, and Sec. 4 provides a discussion on the proposed and compared methods. Finally, Sec. 5 presents the conclusions of the paper.

2. Materials and Methods

2.1. Materials

All time lapse videos were obtained with the phase contrast microscope in BioStation IM.7,11 The videos were acquired using either a or objective with resolution. A total of 27,603 unlabeled gray scale images and 3559 labeled gray scale images were obtained from six videos and eight videos. Both unlabeled and labeled images were obtained automatically by the method described in Guan et al.3,20,21 The labeled dataset had the following number of gray scale images for each class: (1) 636 cell cluster images, (2) 773 debris images, (3) 519 unattached cell images, (4) 704 attached cell images, (5) 413 dynamically blebbing cell images, and (6) 514 apoptotically blebbing cell images. The ground truth for the datasets was generated manually by stem cell experts. For each class, we used 75% of the dataset for training and the remaining 25% for out-of-sample testing. To generalize the classifier, five-fold cross validation was done during model learning. Model learning is performed with training data only.

2.2. Methods

In this section, we first present the motivation for our proposed approach. This is followed by a method for automated cell region detection, which is the segmentation component. We then describe RandNet and elaborate on the autoencoded feature extractor as well as the pre-trained subnetworks for the classification component. The classification component is part of the modularized system as shown in Fig. 3. A pseudocode for building the RandNet model is also provided.

2.2.1. Motivation of the approach

Domain knowledge often comes from human perception, which is the most complex yet efficient cognitive system.
Through hypothetical assumption and visual inspection, we can sometimes identify useful features of hESCs for classification. However, domain knowledge is limited by the amount of information the brain can absorb. With tens of thousands of unlabeled and labeled data, experts can have difficulty either conceptualizing or generalizing the hidden information contained in the data. Deep learning techniques can help to understand the vast amount of data and solve the difficulty in creating automated algorithms for repetitious tasks performed by humans.

Consider the task of studying apoptotic processes of cells with test chemicals in a toxicity experiment. Observing the dynamic changes in the texture and shape of apoptotic processes of a cell requires a significant amount of manual labor for annotating individual video frames. Currently, biologists spend hours of manual labor in annotating these images, which is a very tedious and menial task. Our deep learning-based approach can learn to automatically segment these frames from the vast amount of data available in an unsupervised manner, thus significantly reducing the amount of time biologists spend annotating images, which improves their efficiency. The proposed approach uses an unsupervised technique to build the foundation of the encoder network. The proposed method also uses both the unlabeled and labeled data to build a reliable classification system.

2.2.2. Segmentation component

Guan et al.3 proposed a model-based method for automatically segmenting hESCs. This automated cell region detection is an essential algorithm in developing automated frame component decomposition in hESC phase contrast videos. They considered the foreground and background intensity distribution to be a mixture of two Gaussians. The objective of their algorithm is to find an optimal threshold that optimizes a criterion derived from the intensity distribution of foreground and background.
The optimal segmentation is achieved at the highest criterion value. Since the segmentation method yields a binary image for each frame, we were able to extract a pool of individual components from each frame. Figure 4 shows the detected components of frames under and objectives. These detected components are then ready to be classified into one of the six aforementioned classes.

2.2.3. Classification system overview

The proposed classification system is built with both labeled and unlabeled data, and it consists of many random pre-trained subnetworks. The proposed method utilizes unlabeled data to build the encoder component in the pre-trained subnetworks and labeled data to fine-tune the RandNet. The RandNet structure also incorporates ensemble methodology to constrain overfitting. Figure 5 shows a graphical depiction of how RandNet is built with pre-trained subnetworks and the ensemble concept.

2.2.4. Random network

RandNet utilizes the concept of bagging in deep learning by creating subnetworks. Bagging, or bootstrap aggregation, is a machine learning concept used to reduce variance and avoid overfitting.22–25 RandNet, developed in this paper, is a method that contains many subnetworks that have a common pre-trained model and are fine-tuned with random samples. RandNet collects the results from all subnetworks and passes them to a stacking network in which the final decision is made. The detail of the stacking network is shown in Fig. 6. The stacking network is designed to be simple and has only two main dense layers.

2.2.5. Autoencoded feature extractor

The autoencoder network is an efficient unsupervised learning method that learns the representation of a set of data. The autoencoder network contains two major components: encoder and decoder.26–28 In this paper, we used a structure similar to AlexNet as the basis of an encoder, and then we designed a decoder network from it. Although the VGG architecture29 slightly outperforms AlexNet30 as shown in Sec.
3.3, this difference is not significant, and since the AlexNet architecture requires fewer computational resources, we chose it for its simple implementation. As shown in Fig. 5(a), the encoder generates a set of latent representations for the unlabeled data. The details of both encoder and decoder structures are shown in Fig. 7. The autoencoder network used the Adadelta optimizer31 and the pixel-wise binary cross-entropy loss function. Since the final layer in the autoencoder network was chosen to be a sigmoid activation layer, pixel-wise binary cross entropy is an applicable loss measure. The loss function equation is given as follows:

$$L_{\mathrm{pix}} = -\sum_{n=1}^{N}\sum_{i=1}^{R}\sum_{j=1}^{C}\left[y_{ij}^{(n)}\log \hat{y}_{ij}^{(n)} + \left(1-y_{ij}^{(n)}\right)\log\left(1-\hat{y}_{ij}^{(n)}\right)\right]$$

where $L_{\mathrm{pix}}$ is the total pixel-wise loss in the autoencoder network, $N$ is the total number of sample images in a batch, and $R$ and $C$ are the total number of rows and columns, respectively. $y_{ij}^{(n)}$ and $\hat{y}_{ij}^{(n)}$ are the ground-truth and predicted label values, respectively, in the $i$'th row and $j$'th column for the $n$'th sample. Both $y_{ij}^{(n)} \in [0,1]$ and $\hat{y}_{ij}^{(n)} \in [0,1]$.

2.2.6. Pre-trained subnetwork

The subnetwork used the encoder structure derived from the autoencoder network [in Step 2, Fig. 5(b)] as the basis for building a subclassifier. Each pre-trained subnetwork is fine-tuned with random samples and has a topper structure. The layers of the topper structure are shown in Fig. 8.

Fig. 8 Topper structure. (Note: Dimensions without brackets are kernel dimensions of the current box, and dimensions with parentheses are output dimensions of the current box.)

Since the encoder structure was unfrozen in each subnetwork, the fine-tuning with random samples affects the weights in the encoder structure. Therefore, we were able to emulate bagging for the proposed method. For this subnetwork, we use categorical cross entropy as our loss function, which is given as

$$L_{\mathrm{cat}} = -\sum_{n=1}^{N}\sum_{k=1}^{K} y_{nk}\log \hat{y}_{nk}$$

where $L_{\mathrm{cat}}$ is the total categorical cross entropy in the pre-trained subnetwork. $N$ and $K$ are the total number of sample images and classes in a batch, respectively. $y_{nk}$ and $\hat{y}_{nk}$ are the ground-truth and predicted values, respectively, for the $n$'th sample and $k$'th class, where $y_{nk} \in \{0,1\}$ and $\hat{y}_{nk} \in [0,1]$. Table 1 shows the pseudocode for building the classifier model.

Table 1 Pseudocode for building the classifier model.
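
The building procedure described in Secs. 2.2.3 to 2.2.6 (bootstrap-sampled subnetworks whose outputs feed a stacking classifier) can be illustrated with a small sketch. This is not the authors' implementation: the CNN encoder, topper, and stacking network are replaced here by plain softmax regression so the example runs with NumPy alone, the input is synthetic "latent features" rather than hESC images, and all function names (`fit_softmax`, `build_randnet`, `randnet_predict`, `make_toy_features`) are hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fit_softmax(x, y, n_classes, epochs=200, lr=0.1):
    """Hypothetical stand-in for fine-tuning: softmax regression by gradient descent."""
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        grad = softmax(x @ w + b) - onehot  # gradient of cross entropy w.r.t. logits
        w -= lr * (x.T @ grad) / len(x)
        b -= lr * grad.mean(axis=0)
    return w, b


def predict_proba(model, x):
    w, b = model
    return softmax(x @ w + b)


def build_randnet(features, labels, n_classes, n_subnets=33, seed=0):
    """Fit n_subnets bootstrap-trained classifiers plus one stacking classifier."""
    rng = np.random.default_rng(seed)
    subnets = []
    for _ in range(n_subnets):
        # Bagging step: draw a sample of the training set with replacement.
        idx = rng.integers(0, len(features), size=len(features))
        subnets.append(fit_softmax(features[idx], labels[idx], n_classes))
    # Stacking step: concatenate every subnetwork's class probabilities.
    stacked = np.hstack([predict_proba(m, features) for m in subnets])
    stacker = fit_softmax(stacked, labels, n_classes)
    return subnets, stacker


def randnet_predict(subnets, stacker, features):
    stacked = np.hstack([predict_proba(m, features) for m in subnets])
    return predict_proba(stacker, stacked).argmax(axis=1)


def make_toy_features(n_per_class=100, n_classes=6, dim=8, seed=1):
    """Synthetic, well-separated 'latent features'; not hESC data."""
    rng = np.random.default_rng(seed)
    centers = np.zeros((n_classes, dim))
    centers[np.arange(n_classes), np.arange(n_classes)] = 5.0
    labels = np.repeat(np.arange(n_classes), n_per_class)
    features = centers[labels] + rng.normal(size=(len(labels), dim))
    return features, labels


if __name__ == "__main__":
    x, y = make_toy_features()
    subnets, stacker = build_randnet(x, y, n_classes=6)
    acc = (randnet_predict(subnets, stacker, x) == y).mean()
    print(f"{len(subnets)} subnetworks, training accuracy {acc:.3f}")
```

In the paper's setting, each call to `fit_softmax` would correspond to fine-tuning a copy of the pre-trained encoder plus topper on a random sample of the labeled data, and the stacking classifier would be the two-dense-layer network of Fig. 6.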

3. Results

3.1. Parameters and Optimization

In our approach, all cropped images after the detection module were resized to with bicubic interpolation, and the image intensities were normalized by dividing them by 255. No additional data augmentation was performed. For the autoencoder network, each subnetwork was trained independently, and the latent representation of the subnetwork was used to train the topper network. There are two fixed parameters for each subnetwork: epochs and batch size, which are set to 10 and 128, respectively. The default Adadelta optimizer is used for the autoencoder network.31 For RandNet, there are five parameters: epochs, batch size, number of subnetworks, learning rate, and decay rate. We used 25 epochs with early stopping, a batch size of 50, and a total of 33 subnetworks. We also used the default Adam optimizer34 with a learning rate of 0.001. All parameters are fixed except the number of subnetworks, which has a search range from 1 to 37 with a step size of 2. Figure 9 shows that the model with 33 subnetworks has the highest average validation accuracy as well as the lowest average validation loss. It should also be noted that the processing speed for our approach using all 33 subnetworks during inference is 6.25 frames per second (FPS) compared with 4.16 FPS using the approach proposed by Theagarajan et al.19

Fig. 9 Five-fold cross-validation results. (a) Mean accuracy vs. number of subnetworks curve; (b) mean loss vs. number of subnetworks curve.

Using an ensemble of classifiers is similar to using dropout during training, but they are not the same.35 Ensemble training focuses on training each network with a different subset of data, while dropout reduces feature spaces randomly. Although both the ensemble method and dropout can generalize the network, the former influences the model with data and the latter manipulates the extracted features.
The proposed method uses a simple subnetwork, and each subnetwork was trained independently; therefore, dropout was not considered in each subnetwork. Most importantly, the data-driven model preserves all essential features for reconstructing the input image in a simple autoencoder network. Figure 10 shows the comparison of the reconstructed images with and without dropout. It can be seen that, when we use dropout, the reconstructed images are blurrier due to missing feature information.

3.2. Performance Measures

For performance analysis and comparison, we used the confusion matrix for evaluation.36 The average classification rate (ACR) and the individual true positive rate (TPR) are calculated from the confusion matrix as follows:

$$\mathrm{ACR} = \frac{1}{N}\sum_{i=1}^{K} CM_{ii}, \qquad \mathrm{TPR}_i = \frac{CM_{ii}}{N_i}$$

It is worth noting that $CM_{ii}$ is the $i$'th diagonal element in the confusion matrix $CM$. $i$ is an element of $\{1, \ldots, K\}$, where $K$ is the total number of classes. $N$ is the total number of evaluated observations. $\mathrm{TPR}_i$ is the true positive rate/recall for the $i$'th class. $N_i$ is the total number of samples in the $i$'th class. $CM_{ij}$ is the element of $CM$ in the $i$'th row and $j$'th column.

There are three different categories of accuracies in evaluating the performance of a model: (1) training accuracies, (2) validation accuracies, and (3) out-of-sample testing accuracy. Training and validation accuracies refer to cross validation accuracies for training and validating sets, respectively. The out-of-sample testing accuracy is slightly different from the validation scheme. Once the best model parameters are learned from the model selection process, the final model is obtained with the entire training dataset and the best parameters. This final model is then used to evaluate performance on the testing dataset, and it produces the out-of-sample accuracy.
Typically, training and validation accuracies show us the estimated bias and variance in the final model while out-of-sample testing accuracy shows the true variance in the final model.

3.3. Experimental Results

The proposed RandNet is compared with the state-of-the-art methods as reported in Table 2. The top two performers are the proposed RandNet and the fused CNN triplet.19 The proposed RandNet has 97.23% mean accuracy in a five-fold cross validation and a seemingly low standard deviation in its validation results. The reason that both RandNet and fused CNN triplet outperformed other methods is that additional data are being used. Both aforementioned methods were trained with data other than the given labeled data. The RandNet used unlabeled data to pre-train its subnetworks and then fine-tuned it with the labeled data. On the other hand, fused CNN triplet19 used both synthetic data and real labeled data in training. ResNets,37 VGGs,29 and AlexNet30 were trained with only labeled data. Consequently, they seem to have higher variance in their performances. They also perform worst in out-of-sample testing, as shown in Table 3.

Table 2 Five-fold cross-validation results.
Table 3 Testing data results.

4. Discussions

Compared with ResNets, VGGs, and AlexNet, the proposed method performed better by at least 6%, as shown in Table 3. The performance of these other methods was close within their individual standard deviations. The proposed method has a significantly lower standard deviation than ResNets, VGGs, and AlexNet. Therefore, the proposed method still performed better in out-of-sample testing. Since the proposed method incorporated the concept of bagging and used 33 random subnetworks, the proposed method has a low standard deviation.

Compared with the fused CNN triplet,19 RandNet performed better in both five-fold cross validation and out-of-sample testing. As shown in Table 2, RandNet was about 2% better than fused CNN triplet in validation results. In terms of out-of-sample testing, the proposed method had a slight 0.45% lead on fused CNN triplet as shown in Table 3. The confusion matrix of the proposed method on the testing dataset is shown in Table 4.

The proposed method also outperformed fused CNN triplet of Ref. 19 in terms of training cost. RandNet's computational cost in training is significantly lower than that of fused CNN triplet. According to Theagarajan et al.,18 fused CNN triplet used an additional 240,000 synthetic images for training, 40,000 for each class. Fused CNN triplet took about a month for synthetic image generation and about four days for final model building. On the other hand, the proposed RandNet had about 5 h of training time and used only 27,603 unlabeled images for pre-training the encoder network. The proposed method was implemented on a desktop with a 3.4 GHz Intel(R) Core i7-3770 CPU and an NVIDIA GeForce GTX 1070 GPU.

Table 4 Confusion matrix for testing data using RandNet.
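
The measures of Sec. 3.2 follow directly from any confusion matrix: overall accuracy is the sum of the diagonal divided by the number of observations, and the recall of a class is its diagonal entry divided by the number of samples in that class. The sketch below assumes that rows index ground-truth classes and columns index predicted classes, which the text does not state explicitly; the function names and the three-class matrix `toy_cm` are made up for illustration and are not the values of Table 4.

```python
import numpy as np


def overall_accuracy(cm):
    """Sum of the diagonal divided by the total number of observations."""
    cm = np.asarray(cm, dtype=float)
    return np.trace(cm) / cm.sum()


def per_class_recall(cm):
    """Diagonal entry divided by the row sum (assumes rows are ground truth)."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)


# Made-up 3-class confusion matrix, 50 samples per class (illustration only).
toy_cm = [
    [48, 2, 0],
    [1, 45, 4],
    [0, 5, 45],
]

if __name__ == "__main__":
    print("accuracy:", overall_accuracy(toy_cm))
    print("recall per class:", per_class_recall(toy_cm))
```

If the matrix were stored with predictions along the rows instead, `cm.sum(axis=1)` would have to become `cm.sum(axis=0)` for the recall to be correct.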

4.1. Misclassification Samples

The proposed method had at least 93% TPR/recall for each individual class, as shown in Table 5. It performed best in identifying attached cells, with a recall of 98.30%. However, it performed worst for unattached cells, even though unattached cells are generally easy to identify, as shown in Fig. 2(c).

Table 5 Individual recall for RandNet.
From the typical misclassified images in out-of-sample testing shown in Fig. 11, we conclude that the blurring effects in the autoencoder network might be the cause of misclassifications. As shown in Figs. 11(b) and 11(c), two unattached cells were blurred out after passing through the autoencoder network. Therefore, these cells looked visually similar to the attached cells. Moreover, this blurring effect might be more significant on the hidden representation generated by the encoder that was used to build the subnetworks.

Fig. 11 Typical misclassified images in out-of-sample testing: (a) cluster predicted as apoptotic cell; (b) unattached cell predicted as attached cell; (c) unattached cell predicted as attached cell; (d) attached cell predicted as cluster; (e) dynamic blebbing cell predicted as attached cell; (f) dynamic blebbing cell predicted as cluster; (g) debris predicted as apoptotic cell; (h) debris predicted as dynamic blebbing cell. (Note: Recovered images are obtained from the autoencoder network.)

4.2. Additional Experiments

We compared the segmentation component of our approach3 with Mask RCNN;38 our approach achieved a Dice coefficient of 0.86, while Mask RCNN achieved 0.92. To train the Mask RCNN, we used 50% of the data for training. A significant difference between the two approaches is that our approach has only four learnable parameters as described in Ref. 3, while Mask RCNN has 43.9 million learnable parameters. Moreover, the approach proposed by Guan et al. can run on a single Intel i7 CPU,3 while an Nvidia 1080Ti GPU is required to train the Mask RCNN model. Additionally, our detection algorithm is completely unsupervised, whereas Mask RCNN is supervised and requires annotated training data. Further, we replaced the segmentation component proposed by Guan et al.3 in our approach with Mask RCNN38 and passed the segmented images as input to our classification component. The classification results and recall for each cell type are shown in Tables 6 and 7, respectively.

Table 6 Confusion matrix for RandNet using Mask RCNN as the segmentation component.
Table 7 Recall of each cell type for RandNet using Mask RCNN as the segmentation component.
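
The Dice coefficient used in Sec. 4.2 to compare the two segmentation methods measures the overlap between a predicted binary mask and a ground-truth mask as twice the size of their intersection divided by the sum of their sizes. A minimal sketch follows; the function name `dice_coefficient` and the tiny example masks are illustrative, and returning 1.0 when both masks are empty is a convention chosen here rather than something stated in the paper.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * np.logical_and(pred, true).sum() / denom


# Tiny made-up masks: 3 foreground pixels each, 2 of them shared.
example_pred = [[1, 1, 0], [0, 1, 0]]
example_true = [[1, 0, 0], [0, 1, 1]]

if __name__ == "__main__":
    print("Dice:", dice_coefficient(example_pred, example_true))
```

A value of 1 means the two masks coincide exactly, and a value of 0 means they share no foreground pixels.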

As shown in Table 7, the recall for each cell type was above 89%, and the proposed classification component had an accuracy of 93.79% on the Mask RCNN segmented images. Since the proposed classification component was not trained with samples from Mask RCNN, a small accuracy degradation was expected. The proposed classification component still showed reliable performance on data samples that were not generated by the proposed segmentation method.

5. Conclusions

Automated classification of hESCs in phase contrast videos is essential for a fast quantifiable analysis of hESC behaviors. The proposed RandNet utilized unlabeled data for pre-training, and it incorporated both transfer and ensemble learning concepts. RandNet not only has lower training cost with pre-trained models, but it also can improve performance through fine-tuning with labeled data. It had low performance variance in the cross validation results. This paper has demonstrated that RandNet is an efficient and effective method. In terms of efficiency, it uses the combination of subsampling and pre-trained models to generate subnetworks. In terms of effectiveness, it is a robust method that provides a generalized solution for hESC classification. Our objective in this paper has been to show that we can use both labeled and unlabeled datasets. This software enables quantitative analysis of changes in and behavior of hESCs in video. In the future, we will explore additional deep networks for building subnetworks. Since the blurring effects of the current simple network affected classification performance, we will explore deeper networks to learn a finer hidden representation for hESC classification.

Acknowledgments

This research was supported in part by US National Science Foundation Integrated Graduate Education Research and Training (NSF-IGERT), Video Bioinformatics Grant DGE 0903667; and by Tobacco-Related Disease Research Program (TRDRP), Grant 20XT-0118 and Grant 22RT-0127.

References

J. Nichols and A. Smith, “The origin and identity of embryonic stem cells,” Development 138(1), 3–8 (2011). https://doi.org/10.1242/dev.050831
J. A. Thomson et al., “Embryonic stem cell lines derived from human blastocysts,” Science 282(5391), 1145–1147 (1998). https://doi.org/10.1126/science.282.5391.1145
B. X. Guan et al., “Bio-driven cell region detection in human embryonic stem cell assay,” IEEE/ACM Trans. Comput. Biol. Bioinf. 11(3), 604–611 (2014). https://doi.org/10.1109/TCBB.2014.2306836
Z. Zhu and D. Huangfu, “Human pluripotent stem cells: an emerging model in developmental biology,” Development 140(4), 705–717 (2013). https://doi.org/10.1242/dev.086165
P. Talbot and S. Lin, “Mouse and human embryonic stem cells: can they improve human health by preventing disease?,” Curr. Top. Med. Chem. 11(13), 1638–1652 (2011). https://doi.org/10.2174/156802611796117621
S. Lin et al., “Comparison of the toxicity of smoke from conventional and harm reduction cigarettes using human embryonic stem cells,” Toxicol. Sci. 118(1), 202–212 (2010). https://doi.org/10.1093/toxsci/kfq241
P. Talbot et al., “Use of video bioinformatics tools in stem cell toxicology,” Handbook of Nanotoxicology, Nanomedicine and Stem Cell Use in Toxicology, John Wiley & Sons, Ltd. (2014).
B. Bhanu and P. Talbot, Video Bioinformatics – From Live Imaging to Knowledge, Springer (2015).
B. X. Guan et al., “Comparison of texture features for human embryonic stem cells with bio-inspired multi-class support vector machine,” in IEEE Int. Conf. Image Process., 4102–4106 (2014). https://doi.org/10.1109/ICIP.2014.7025833
B. X. Guan et al., “Human embryonic stem cell detection by spatial information and mixture of Gaussians,” in Int. Conf. Healthcare Inf., Imaging and Syst. Biol., 307–314 (2011).
“Nikon Biostation-IM,” http://www.nikoninstruments.com/Products/Live-Cell-Screening-Systems/BioStation-IM
T. D. Miller and P. Elgard, “Defining modules, modularity and modularization,” in Proc. 13th IPS Res. Semin. (1998).
G. Klushin, C. Fortin and Z. Tekic, “Modular design guideline for projects from scratch,” in Ann. DAAAM & Proc. (2018).
L. de Aguiar Corrêa, F. I. Kubota and P. A. C. Miguel, “Towards a contribution to modularity concepts and principal domains,” Prod.: Manage. Dev. 10(2), 119–130 (2017). https://doi.org/10.4322/pmd.2013.006
H. Niioka et al., “Classification of C2C12 cells at differentiation by convolutional neural network of deep learning using phase contrast images,” Hum. Cell 31(1), 87–93 (2018). https://doi.org/10.1007/s13577-017-0191-9
W. Xie, J. A. Noble and A. Zisserman,
“Microscopy cell counting and detection with fully convolutional regression networks,”
Comput. Methods Biomech. Biomed. Eng.: Imaging Vis., 6
(3), 283
–292
(2018). https://doi.org/10.1080/21681163.2016.1149104 Google Scholar
Y. H. Chang et al.,
“Human induced pluripotent stem cell region recognition in microscopy images using convolutional neural networks,”
in IEEE Int. Conf. Eng. Med. and Biol. Soc.,
4058
–4061
(2017). https://doi.org/10.1109/EMBC.2017.8037747 Google Scholar
R. Theagarajan, B. X. Guan and B. Bhanu,
“DeephESC: an automated system for generating and classification of human embryonic stem cells,”
in IEEE Int. Conf. Pattern Recognit.,
3826
–3831
(2018). https://doi.org/10.1109/ICPR.2018.8545356 Google Scholar
R. Theagarajan and B. Bhanu, “DeephESC 2.0: deep generative multi adversarial networks for improving the classification of hESC,” PLoS One, 14(3), e0212849 (2019). https://doi.org/10.1371/journal.pone.0212849
B. X. Guan et al., “Extraction of blebs in human embryonic stem cell videos,” IEEE/ACM Trans. Comput. Biol. Bioinf., 13(4), 678–688 (2015). https://doi.org/10.1109/TCBB.2015.2480091
B. X. Guan et al., “Automated human embryonic stem cell detection,” in IEEE Int. Conf. Healthcare Inf., Imaging and Syst. Biol., 75–82 (2012). https://doi.org/10.1109/HISB.2012.25
L. Breiman et al., Classification and Regression Trees, CRC Press (1984).
L. Breiman, “Random forests,” Mach. Learn., 45(1), 5–32 (2001). https://doi.org/10.1023/A:1010933404324
P. Geurts, D. Ernst and L. Wehenkel, “Extremely randomized trees,” Mach. Learn., 63(1), 3–42 (2006). https://doi.org/10.1007/s10994-006-6226-1
J. Morgan, Classification and Regression Tree Analysis, Boston University, Boston (2014).
W. Wang et al., “Generalized autoencoder: a neural network framework for dimensionality reduction,” in IEEE Int. Conf. Comput. Vision and Pattern Recognit. Workshops, 490–497 (2014). https://doi.org/10.1109/CVPRW.2014.79
J. E. S. Sklan et al., “Toward content-based image retrieval with deep convolutional neural networks,” Proc. SPIE 9417, 94172C (2015). https://doi.org/10.1117/12.2081551
Z. Camlica, H. R. Tizhoosh and F. Khalvati, “Autoencoding the retrieval relevance of medical images,” in Int. Conf. Image Process. Theory, Tools Appl., 550–555 (2015).
K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” (2014).
A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Adv. Neural Inf. Process. Syst., 1097–1105 (2012).
M. D. Zeiler, “Adadelta: an adaptive learning rate method,” (2012).
O. Pons, “Bootstrap of means under stratified sampling,” Electron. J. Stat., 1, 381–391 (2007). https://doi.org/10.1214/07-EJS033
L. Rokach, “Ensemble-based classifiers,” Artif. Intell. Rev., 33(1–2), 1–39 (2010). https://doi.org/10.1007/s10462-009-9124-7
S. J. Reddi, S. Kale and S. Kumar, “On the convergence of Adam and beyond,” (2019).
I. J. Goodfellow et al., “Maxout networks,” in Int. Conf. Mach. Learn., 1319–1327 (2013).
D. M. Powers, “Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation,” (2011).
K. He et al., “Deep residual learning for image recognition,” in IEEE Int. Conf. Comput. Vision and Pattern Recognit., 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
K. He et al., “Mask R-CNN,” in Proc. IEEE Int. Conf. Comput. Vision, 2961–2969 (2017).
Biography
Benjamin X. Guan received his BS degree with high honor, his MS degree, and his PhD, all in electrical engineering, from the University of California, Riverside (UCR). He was an NSF IGERT Fellow in the Video Bioinformatics Program at UCR. His research interests include human embryonic stem cell segmentation, detection, and classification. He received the Best Paper Award from the IEEE International Conference on Health Informatics, Imaging and System Biology. Currently, he is working with Northrop Grumman Corporation.
Bir Bhanu received his SM and EE degrees in electrical engineering and computer science from Massachusetts Institute of Technology, Cambridge, Massachusetts, his PhD in electrical engineering from the University of Southern California, Los Angeles, California, and his MBA from the University of California at Irvine, Irvine, California. He is the founding professor of electrical engineering with the University of California at Riverside (UCR), Riverside, California, and served as its first chair from 1991 to 1994. He is currently the Bourns Endowed University of California Presidential Chair in engineering, distinguished professor of electrical and computer engineering, and the founding director of the Interdisciplinary Center for Research in Intelligent Systems (1998–2019) and the Visualization and Intelligent Systems Laboratory, UCR. He has published extensively and has 18 patents. Prior to joining UCR, he was a senior Honeywell fellow with Honeywell, Inc. He is a fellow of IEEE, AAAS, IAPR, SPIE, NAI, and AIMBE. His research interests include computer vision, pattern recognition and data mining, machine learning, artificial intelligence, image processing, image and video databases, graphics and visualization, robotics, human-computer interactions, and biological, medical, military, and intelligence applications.
Rajkumar Theagarajan received his BE degree in electronics and communication engineering from Anna University, Chennai, India, in 2014 and his MS degree and PhD in electrical and computer engineering from the University of California, Riverside, California, in 2016 and 2020, respectively. Currently, he is working with KLA Corporation. His research interests include computer vision, pattern recognition, image processing, and machine learning.
Hengyue Liu received his BS degree from Beijing University of Posts and Telecommunications, Beijing, China, in 2014 and his MS degree from the University of Southern California, Los Angeles, California, in 2016. He is currently working toward his PhD in electrical and computer engineering at the Center for Research in Intelligent Systems, University of California, Riverside, California. His research interests include object detection, scene graph generation, and mobile vision.
Prue Talbot is a professor of cell biology and the director of the UCR Stem Cell Center and Core. Her lab is interested in using stem cells to prevent disease and in the effects of tobacco products on human health, including prenatal development. Some of her recent projects have included working with engineers to develop video bioinformatics tools to study morphological and dynamic changes in stem cells during growth and differentiation under normal and stressful conditions and predicting adverse reactions of cells to chemical treatments.
Nikki Weng received her BS degree from Chang Gung University, Taiwan. She received her PhD in cell, molecular, and developmental biology from UC Riverside in 2015. She participated, as a fellow, in the UC Riverside NSF Integrative Graduate Education and Research Traineeship (IGERT) program on video bioinformatics. Currently, she is a scientist at Irvine Scientific.