A Dozen Tricks with Multitask Learning

Chapter in Neural Networks: Tricks of the Trade

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1524)

Abstract

Multitask Learning is an inductive transfer method that improves generalization accuracy on a main task by using the information contained in the training signals of other related tasks. It does this by learning the extra tasks in parallel with the main task while using a shared representation; what is learned for each task can help the other tasks be learned better. This chapter describes a dozen opportunities for applying multitask learning to real problems. At the end of the chapter we also make several suggestions for how to get the most out of multitask learning on real-world problems.
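
The shared-representation setup the abstract describes is easy to sketch in code. The following is a minimal illustration, not code from the chapter: a hypothetical PyTorch net (the layer sizes, data shapes, and the 0.5 auxiliary-loss weight are all assumptions made for the example) in which a main output head and one auxiliary head share a single hidden layer and are trained in parallel, so the auxiliary training signal shapes the shared weights.

    import torch
    import torch.nn as nn

    class MultitaskNet(nn.Module):
        """One hidden layer shared by a main output and an auxiliary output."""
        def __init__(self, n_inputs, n_hidden=32):
            super().__init__()
            self.shared = nn.Linear(n_inputs, n_hidden)  # representation shared by all tasks
            self.main_head = nn.Linear(n_hidden, 1)      # main task output
            self.aux_head = nn.Linear(n_hidden, 1)       # extra, related task output

        def forward(self, x):
            h = torch.relu(self.shared(x))
            return self.main_head(h), self.aux_head(h)

    # Toy stand-ins for real data: inputs plus one training signal per task.
    x = torch.randn(128, 10)
    y_main = torch.randn(128, 1)
    y_aux = torch.randn(128, 1)

    net = MultitaskNet(n_inputs=10)
    opt = torch.optim.SGD(net.parameters(), lr=0.01)
    mse = nn.MSELoss()

    for step in range(100):
        opt.zero_grad()
        pred_main, pred_aux = net(x)
        # Both losses backpropagate through the shared layer, so the auxiliary
        # training signal acts as an inductive bias on the shared weights.
        loss = mse(pred_main, y_main) + 0.5 * mse(pred_aux, y_aux)
        loss.backward()
        opt.step()

At test time only the main head is read out; the auxiliary head exists solely to bias the shared representation during training.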




Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Caruana, R. (1998). A Dozen Tricks with Multitask Learning. In: Orr, G.B., Müller, K.-R. (eds) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol 1524. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49430-8_9

  • DOI: https://doi.org/10.1007/3-540-49430-8_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65311-0

  • Online ISBN: 978-3-540-49430-0

  • eBook Packages: Springer Book Archive
