[1] Chopra, S. and Meindl, P., “Supply chain management: Strategy, planning & operation”. Gabler, 2007.
[2] Bichler, M. and Kalagnanam, J., “A nonparametric estimator for setting reserve prices in procurement auctions”. ACM Conference on Electronic Commerce 2003: 254-255, 2003.
[3] Chen, S.L. and Tseng, M.M., “A Negotiation-Credit-Auction Mechanism for Procuring Customized Products”. International Journal of Production Economics, 127(1): 203-210, 2010.
[4] Padgham, L. and Winikoff, M., “Developing intelligent agent systems: A practical guide”. Vol. 13. Wiley, 2005.
[5] Beam, C. and Segev, A., “Automated negotiations: A survey of the state of the art”. Wirtschaftsinformatik, 39(3): 263-268, 1997.
[6] Sutton, R.S. (editor), “Reinforcement Learning”. Kluwer Academic Press, Boston, MA, 1992.
[7] Chaharsooghi, S.K., Heydari, J., et al., “A reinforcement learning model for supply chain ordering management: An application to the beer game”. Decision Support Systems, 45(4): 949-959, 2008.
[8] Li, X., Wang, J. and Sawhney, R., “Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems”. European Journal of Operational Research, 221(1): 99-109, 2012.
[9] Iima, H. and Kuroe, Y., “Swarm Reinforcement Learning Algorithm Based on Particle Swarm Optimization Whose Personal Bests Have Lifespans”. Neural Information Processing, Springer Berlin Heidelberg, 5864: 169-178, 2009.
[10] Giannoccaro, I. and Pontrandolfo, P., “Inventory management in supply chains: a reinforcement learning approach”. International Journal of Production Economics, 78(2): 153-161, 2002.
[11] Tang, H., Xu, L., Sun, J., Chen, Y. and Zhou, L., “Modeling and optimization control of a demand-driven, conveyor-serviced production station”. European Journal of Operational Research, 243(3): 839-851, 2015.
[12] Fu, J. and Fu, Y., “An adaptive multi-agent system for cost collaborative management in supply chains”. Engineering Applications of Artificial Intelligence, 44: 91-100, 2015.
[13] Mortazavi, A., Khamseh, A.A. and Azimi, P., “Designing of an intelligent self-adaptive model for supply chain ordering management system”. Engineering Applications of Artificial Intelligence, 37: 207-220, 2015.
[14] Russell, S. and Norvig, P., “Artificial Intelligence: A Modern Approach”. Prentice-Hall, Saddle River, NJ, 1995.
[15] Sutton, R.S. and Barto, A.G., “Reinforcement Learning”. MIT Press, Cambridge, 1998.
[16] Watkins, C.J.C.H., “Learning from delayed rewards”. Doctoral dissertation, University of Cambridge, 1989.
[17] Tsitsiklis, J.N., “Asynchronous stochastic approximation and Q-learning”. Machine Learning, 16(3): 185-202, 1994.
[18] Kennedy, J. and Eberhart, R.C., “Swarm Intelligence”. Morgan Kaufmann Publishers, San Francisco, 2001.
[19] Talbi, E., “Metaheuristics: From Design to Implementation”. ISBN: 978-0-470-27858-1, 2009.
[20] Abdulhai, B., Pringle, R. and Karakoulas, G.J., “Reinforcement learning for ITS: Introduction and a case study on adaptive traffic signal control”. Transportation Research Board 80th Annual Meeting, Washington, D.C., 2001.