
Efficient labelling of solar flux evolution videos by a deep learning model


From Nature Astronomy


Abstract

Machine learning is becoming a critical tool for the interrogation of large, complex data. Labelling, defined as the process of adding meaningful annotations, is a crucial step of supervised machine learning. However, labelling datasets is time consuming. Here we show that convolutional neural networks (CNNs) trained on crudely labelled astronomical videos can be leveraged to improve the quality of data labelling and reduce the need for human intervention. We use videos of the solar magnetic field that are divided into two classes—emergence or non-emergence of bipolar magnetic regions (BMRs)—on the basis of their first detection on the solar disk. We train CNNs using crude labels, manually verify, correct disagreements between the labelling and CNN, and repeat this process until convergence is reached. Traditionally, flux emergence labelling is done manually. We find that a high-quality labelled dataset derived through this iterative process reduces the necessary manual verification by 50%. Furthermore, by gradually masking the videos and looking for maximum changes in CNN inference, we locate BMR emergence time without retraining the CNN. This demonstrates the versatility of CNNs for simplifying the challenging task of labelling complex dynamic events.
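The train, verify, correct and repeat loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: a one-parameter threshold classifier stands in for the CNN, each video is reduced to a single hypothetical scalar feature, and the `verify` callback stands in for a human inspecting a video. All function names here are illustrative assumptions.

```python
def train_threshold(features, labels):
    """Stand-in for CNN training (assumption): midpoint of the two class means."""
    pos = [f for f, y in zip(features, labels) if y == 1]
    neg = [f for f, y in zip(features, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def predict(threshold, feature):
    """Classify one video feature: 1 = emergence, 0 = non-emergence."""
    return 1 if feature > threshold else 0


def iterative_relabel(features, crude_labels, verify, max_rounds=10):
    """Retrain, send label/model disagreements to `verify`, repeat until none remain."""
    labels = list(crude_labels)
    verified = set()  # indices a human has already inspected
    for _ in range(max_rounds):
        threshold = train_threshold(features, labels)
        disagreements = [
            i for i, f in enumerate(features)
            if i not in verified and predict(threshold, f) != labels[i]
        ]
        if not disagreements:  # convergence: nothing left to inspect
            break
        for i in disagreements:
            labels[i] = verify(i)  # manual verification (hypothetical callback)
            verified.add(i)
    return labels, verified


# Toy data: 10 non-emergence and 10 emergence videos, two crude labels flipped.
features = [i / 10 for i in range(10)] + [2 + i / 10 for i in range(10)]
truth = [0] * 10 + [1] * 10
crude = list(truth)
crude[2], crude[15] = 1, 0

final_labels, checked = iterative_relabel(features, crude, verify=lambda i: truth[i])
print(f"inspected {len(checked)} of {len(features)} videos")
```

In this toy case only the videos on which the model and the crude label disagree are inspected, which is the mechanism by which such a loop can reduce manual verification. The size of the saving on real data depends on the classifier and the noise in the crude labels.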


Fig. 1: Dataset and model architecture.
Fig. 2: Data management used to build, train, validate and test different models for iterative relabelling, detection threshold estimation and performance evaluation.
Fig. 3: Sequence of iterative relabelling and convergence of performance.
Fig. 4: Model ensemble used to estimate the classification accuracy on the test set.
Fig. 5: Identifying the emergence epoch by frame stacking.
Fig. 6: Emergence epoch identification using an ensemble of models.
Fig. 7: Accuracy of emergence epoch identification.
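The abstract describes locating the emergence time by gradually masking a video and looking for the largest change in CNN inference. The sketch below illustrates that idea under stated assumptions only: frames are reduced to hypothetical scalar values, `toy_score` stands in for a trained CNN's output, and masking from the end of the video with a zero fill is an assumed scheme, since this preview does not give the details.

```python
def toy_score(frames):
    """Hypothetical stand-in for a trained CNN's emergence score on a video."""
    return max(frames)


def mask_after(frames, visible, fill=0.0):
    """Keep the first `visible` frames and replace the rest with `fill`."""
    return frames[:visible] + [fill] * (len(frames) - visible)


def locate_emergence(frames, score):
    """Return the frame whose unmasking changes the score the most.

    The scoring model is applied repeatedly to masked copies of the same
    video; it is never retrained.
    """
    scores = [score(mask_after(frames, k)) for k in range(len(frames) + 1)]
    jumps = [abs(scores[k + 1] - scores[k]) for k in range(len(frames))]
    best = max(range(len(jumps)), key=jumps.__getitem__)
    return best, scores


# Toy video: weak signal for five frames, then a bipolar region appears.
video = [0.05, 0.04, 0.06, 0.05, 0.05, 0.90, 0.95, 1.00, 1.00, 1.00]
emergence_frame, scores = locate_emergence(video, toy_score)
print("estimated emergence frame:", emergence_frame)
```

Because only inference is repeated, the cost is one forward pass per masking step. An ensemble of models, as mentioned in the caption of Fig. 6, could be handled by averaging the per-step scores before taking the largest jump, though that is also an assumption here.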


Data availability

The SoHO/MDI magnetograms, used to create the flux emergence videos for this study, are available from the Joint Science Operations Center (http://jsoc.stanford.edu/ajax/lookdata.html?ds=mdi.fd_M_96m_lev182). All the flux evolution videos with their emergence or non-emergence labels can be accessed through Harvard Dataverse at https://doi.org/10.7910/DVN/6F25MG. Source data are provided with this paper.

Code availability

The iterative relabelling algorithm is described explicitly in the Methods. The code for data preparation and CNN training is available as a Python notebook on GitHub at https://github.com/subhamoysgit/flux_emergence/.


Acknowledgements

This research was funded by NASA grant numbers 80NSSC19M0165 and 80NSSC18K0671.

Author information

Contributions

S.C. and A.M.-J. planned the experiments and wrote the paper. S.C. set up and ran the experiments. A.M.-J. provided the list of events that was analysed. D.A.L. assembled the video sequences used in this work and helped edit the manuscript.

Corresponding author

Correspondence to Subhamoy Chatterjee.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Astronomy thanks Andong Hu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Source data

Source Data Fig. 1

Raw data for all video frames with header information in FITS format.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data for all of the rows and columns.

Source Data Fig. 4

Statistical source data for creating the plots.

Source Data Fig. 5

Statistical source data for creating the plots.

Source Data Fig. 6

Statistical source data for colouring the video frames.

Source Data Fig. 7

Statistical source data for creating the plots.


About this article


Cite this article

Chatterjee, S., Muñoz-Jaramillo, A. & Lamb, D.A. Efficient labelling of solar flux evolution videos by a deep learning model. Nat Astron 6, 796–803 (2022). https://doi.org/10.1038/s41550-022-01701-3


  • DOI: https://doi.org/10.1038/s41550-022-01701-3

  • Springer Nature Limited
