CV & Publications

Escher -- Drawing Hands

Mishtu S Banerjee — Data Scientist

Complex Systems – Analytics & Predictive Modelling – Risk Management – Network Analytics – Database Design – Resource Inventory Design – Ecoinformatics – Computational & Systems Biology – Python – R

About Me 

How things connect fascinates me, regardless of the nature of those connections. I am a computer scientist, sci-fi movie buff, ecologist, forester, poet; a specialist in both complex networks and data privacy. I connect to cross boundaries: across science and art, between ecosystem theories and Internet protocols, between strategic and tactical views of a business, between architecture and implementation details.

I love to build things that work and that solve business problems, while drawing on elegant design ideas. I am a deeply experienced data scientist. I bring together analytic depth with extensive practical experience in designing and building data-driven analytic management systems and software. My background in network science extends my analytics skills with the ability to model, simulate and integrate data, process and business interconnections so as to make good decisions in rapidly changing environments.

I have practical experience in a wide range of analytical technologies drawn from statistics, computer science, biology, and geospatial systems, with an emphasis on predictive modelling, machine learning, network analytics and risk. In addition to being a proficient user of cutting-edge techniques, I am also able to innovate, program, and refine new statistical and machine learning algorithms.

As an analytics system architect I am experienced in balancing technical and business requirements, while managing interdisciplinary teams. I have designed and implemented a wide range of systems, from comprehensive natural resources data warehouses to predictive climate risk infrastructure. I balance deep creativity and design capability with a drive to turn ideas into working systems. My interdisciplinary background crosses statistics, biology, computer science and network science (the area of my PhD). I can often ‘connect the dots’ between a problem in one domain and its solution in another domain.

My love of learning and building systems is infectious, and I strive to inspire and mentor teams to combine individual skills into collective performance. My experience running my own company, consulting, and building systems in several domains makes me very comfortable balancing technical, financial, and human factors in developing strong analytic teams and effective analytical software.

I am not just a data scientist. I run my own business, lead teams, take products from idea to delivery, translate complex technological ideas for marketing, and mentor people to reach their potential. I love any challenge that gives me the opportunity to bring together my people skills with my analytical, algorithmic, network science and business backgrounds.

Resume

Expertise

I am a highly motivated individual with an extensive background in quantitative analysis to support business decisions. I have built and continue to create data-driven predictive analytics and analytic risk management systems that integrate on-the-ground information, geographic information systems (GIS), remote sensing (RS) and climate information. With expertise in complex network theory and analysis, as well as in a wide range of data analysis and modelling techniques, I bring together data, relevant database and mapping technology, statistical modelling, and data visualization techniques to support business strategy and reduce risks. I have analyzed diverse systems from DNA to ecosystems, from the Internet to municipal infrastructure. I am an experienced quantitative analyst, database designer, biologist, project manager and software architect/developer, leading mixed teams of data scientists, analysts, programmers, GIS specialists, engineers, biologists and stakeholders.

Relevant Skills

Analytics, Predictive Modelling, Programming and Information Design: Advanced statistics (linear models, nonlinear models, robust/resistant models, logistic regression, multivariate analysis, model diagnostics and automated variable & model selection, neural networks, fuzzy logic, fuzzy clustering, K-means & Hierarchical clustering, graphical models, support vector machines, decision trees, KNN, network analysis, Monte Carlo methods), visual design for reporting, user interface design, algorithm design; code for SQL generators and spatial queries, statistical data processing, pivot table and data cube frameworks; spatial data and database integration.

Proficient in: Python, R, SQL, Matlab and Systat for analytical programming.

Familiar with Knime, Sage, SPSS, SAS, ArcGIS, Java, C, VB, SQLite, Postgres/PostGIS, Oracle, Access. See: https://github.com/MishtuBanerjee/xaya

Application Development, Project Management and Mentoring:

Project management and budgeting. Code testing and quality assurance. Liaising with clients to gather requirements and test software user interfaces.

Mentoring programmers, data scientists and GIS specialists on analytics, bioinformatics, ecoinformatics, spatial/database integration and data-driven application design using object-oriented, relational, and functional approaches.

Mentoring and teaching on data analysis, algorithm development, database design and unit testing. Application of SCRUM, Extreme Programming and BDD development methodologies. Experienced lecturer and mentor, with excellent verbal and written communication skills. See http://datasigns.org/

Environmental Information System Design:

Environmental monitoring survey and sampling techniques, experimental design, forest inventory, growth and yield, ecoinformatics, ecological network analysis. Relational and dimensional (OLAP) design; business intelligence (BI) and data warehousing concepts from design through implementation; requirements and specifications gathering; UML for visual design communication with technical teams and project stakeholders.

Publications

PhD Thesis

Topological Stability and Dynamic Resilience in Networks: https://www.academia.edu/11347040/Topological_Stability_and_Dynamic_Resilience_in_Complex_Networks

Referee

Nonlinear Dynamics, Psychology, and Life Sciences. 2000, 2002

Trends In Ecology and Evolution. 2015.

Books

Banerjee, S.M., K. Creasey and D. Douglas Gertzen. 2001. Native Woody Plant Seed Collection Guide for British Columbia. BC Ministry of Forests. Province of British Columbia. (Online version of the original paperback publication is available at http://www.for.gov.bc.ca/hti/nativeseedguide/nativeseedguide.htm)

Conference Papers (Refereed and Invited)

Banerjee, M., Karimi, R., Wu, L. and Barker, K. 2011. Quantifying Privacy Violations. SDM 2011, LNCS 6933. pp. 1-17.

Banerjee, S. and J. Kawash. 2009. Re-thinking Computer Literacy in Post-Secondary Education. Proceedings of the 14th Western Canadian Conference on Computing and Education, pp. 17-21.

Barker, K., M. Askari, M. Banerjee, K. Ghazinour, B. Mackas, M. Majedi, S. Pun, and A. Williams. 2009. A Data Privacy Taxonomy. Proceedings of the 26th British National Conference on Databases (BNCOD26), pp. 42-54.

Collier, J. and M. Banerjee. 2000. Benard Cells: A Model Dissipative System. Echo IV. Odense, Denmark.

Journal Papers (Refereed)

Kaharabata, S.K., S.M. Banerjee, M. Keiser, R.L. Desjardins and D. Worth. 2014. Determining the influence of agricultural land use on climate variables for the Canadian Prairies. International Journal of Climatology. 34(15):3849-3862.

Maze, J., K. A. Robson, S. Banerjee. 2003. Expanding the view of emergence in individuals, populations and species of Stipoid grasses: A comparison including Achnatherum occidentale. SEED (Semiotics, evolution, energy, development) Vol3-1.

Maze, J., K. A. Robson, S. Banerjee, A. Vyse. 2002. The relationship between growth rate and emergence in seedlings of Picea engelmannii Parry. SEED (Semiotics, evolution, energy, development) Vol1-2.

Maze, J., K. A. Robson, S. Banerjee. 2002. Studies into abstract properties of individuals. VII. Emergence in Hesperostipa comata and three species of Achnatherum (Poaceae). Int. J. Plt. Sci. 163:379-385.

Maze, J., S. Banerjee, K. A. Robson. 2001. Studies into abstract properties of individuals. VI. The degree of emergence in individuals, populations, species and a three species lineage. Biosystems 61:41-54.

Maze, J., K. A. Robson, S. Banerjee. 2000. Studies into abstract properties of individuals. V. An empirical study of emergence in ontogeny and phylogeny in Achnatherum nelsonii and A. lettermanii. SEED (Semiotics, evolution, energy, development) Vol1-1.

Maze, J., K. A. Robson and S. Banerjee. 2000. Studies into abstract properties of individuals. IV. Emergence in different aged needle primordia of Douglas fir. BioSystems 56:43-53.

Robson, K., J. Maze, R. K. Scagel, and S. Banerjee. 1993. Ontogeny, phylogeny and intraspecific variation in North American Abies (Pinaceae): An empirical approach to organization and evolution. Taxon 42:17-34.

Maze, J., S. Banerjee, Y.A. El-Kassaby, and L.R. Bohm. 1992. Quantitative genetics of integration in Douglas fir. International Journal of Plant Science 153:333-340.

Maze, J., K.A. Robson, S. Banerjee, L.R. Bohm and R.K. Scagel. 1990. Quantitative studies in early ovule and fruit development: Developmental constraints in Balsamorhiza sagittata and B. hookeri. Bot. Gaz. 151:415-422.

El-Kassaby, Y.A., Maze, J., MacLeod, D.A., and Banerjee, S. 1991. Reproductive-cycle plasticity in yellow-cedar (Chamaecyparis nootkatensis). Can. J. Bot. 21:1360-1364.

Funk I, P., S. Banerjee and J. Maze. 1990. The structure of variation and correlations in Abies amabilis from southwestern British Columbia as assessed through a provenance test. Can. J. Bot. 68:1796-1802.

Banerjee, S., Sibbald, P.R. and Maze, J. 1990. Quantifying the dynamics of order and organization in biological systems. Jour. Theor. Biol. 143: 91-111.

Sibbald, P.R., S. Banerjee, and J. Maze. 1989. Calculating higher order DNA sequence information measures. Jour. Theor. Biol. 136: 475-483.

Maze, J. and S. Banerjee. 1989. A comparison of variation between Pseudotsuga menziesii seedlings from genetically defined and undefined sources. Can. J. Bot. 67: 945-947.

Maze, J., S. Banerjee and Y.A. El-Kassaby. 1989. Variation in growth rate within and among full-sib families of Douglas fir (Pseudotsuga menziesii (Mirb.) Franco). Can. J. Bot. 67: 140-145.

Banerjee, S. and J. Maze. 1988. Variation in growth within and among families of Douglas fir through a single season. Can. J. Bot. 66: 2452-2458.

Invited Talks (Post 2012)

One Model to Rule Them All. A Brief History of The Diversity Stability Debate. April 2013. University of Alberta, Centre for Mathematical Biology.

Technical Workshops Conducted

Database Design. Clients include: Lignum Ltd, Northern Interior Vegetation Management Association, UBC Computer Science. Internal workshops for Tesera.

Data Analysis/Data Mining. Clients include: Cognos, Alberta Advanced Education and Career Development, Ontario Ministry of Transport, Northern Interior Vegetation Management Association, S.I. Systems. Internal workshops for Tesera.

Seed Biology and Propagation. Clients include BC Ministry of Forests, Forestry Canada, and numerous British Columbia forest nurseries.

Selected Industry Technical Reports

CLIMATE CHANGE

Kaharabata, S., M. Banerjee and M. Keizer. 2013. Determining the Influence of Agricultural Land Use on Climatic Variables for the Canadian Prairies. Tesera Systems Inc. for AAFC, Agriculture Canada.

Kaharabata, S., M. Banerjee and I. Moss. 2015. Frequency and Intensity of Temperature Extremes During Critical Crop Growth Stages: Downscaling GCM via Bias Correction. Tesera Systems Inc. for AAFC, Agriculture Canada.

INFORMATION SYSTEMS

Moss, I. and S.M. Banerjee. 2006. The XAYA Stand Structure Compiler Operating Manual. ForesTree Dynamics Ltd. and Harmeny Systems Ltd. For Canadian Wildlife Service.

Banerjee, S.M. and T. Mitchell. 2004. Every Map Tells A Story. Proceedings of Open Source GIS Conference 2004. http://www.omsug.ca/osgis2004/proceedings.html.

Banerjee, S.M. 2002. LifeLine AI 2002 Tutorial. Version 3.1 for Java. Harmeny Systems Ltd. http://www.harmeny.com/twiki/bin/view/Main/LifeLine For Harmeny clients using LifeLine (online manual).

Banerjee, M. 1999. Exploring LifeLine — ACT98. Version 2.0 User Manual. Scientificals Consulting, Richmond, BC. For Northern Interior Vegetation Management Association.

Banerjee, M. 1997. Tree Growth Actuary Tables to Support Silvicultural Decision Making. An Actuary Table User’s Guide, Version 1.0. Rustad Bros. & Co. Ltd., Prince George, BC.

Banerjee, M., A. Brackley, G. Lee and I. Moss. 1996. A Single Tree Approach to Estimated Free to Grow. Decision Support Systems Derived From Original Northern Interior Vegetation Management Association (NIVMA) Database. Northwood Pulp and Timber Ltd., Rustad Bros. & Co. Ltd., Prince George, B.C.

BIOLOGICAL SYSTEMS

Banerjee, M. 1997. BEC for Beginners. Native Plant Society of BC Newsletter. 1:8-9.

Banerjee, M., K. Creasey. 1994. Problem seedlots …. An outline of Information and options. Seed and Seedling Extension Topics. 7:1

Banerjee, M. 1994. A presowing treatment for interior spruce and lodgepole pine. Seed and Seedling Extension Topics. 7:2

Banerjee, M. and B. Molitor. 1993. Factors to consider when accessioning seed. Bioline. 11,2:18-20.

Creasey, K.R., T. Myland, B.S.P. Wang, M. Banerjee, B. Downie, and T.L. Noland. 1992. Guidelines for Seed Pretreatment. Ontario Ministry of Natural Resources.

Banerjee, M. and R.K. Scagel. 1992. Nursery Survey of density separation processed (DSP) spruce seed at British Columbia forest nurseries. Nursery results and implications. BC MOF Silviculture Branch, Tree Seed Centre.

Banerjee, M., M.K. Larsen and D. Kolotelo. 1991. Density Separation Processing User’s Manual. BC MOF Silviculture Branch, Tree Seed Centre.

Banerjee, M., M.K. Larsen and R. Scagel. 1991. From Seed to Seedling. New processing techniques. Seed and Seedling Extension Topics. 4:16-18

Edwards, DG, and M. Banerjee. 1989. Prospects for IDS Improvement of Seed Quality. BC MOF/ Forestry Canada. FRDA Memo #115.

Portlock, F.T. 1996. A Field Guide to Collecting Cones of British Columbia Conifers. BC Tree Seed Dealers Association. Chapter Authors: Frank Barnard, Peter Hellenius, Don Pigott, Paulus Vrijmoed. Chapter Contributors: Mishtu Banerjee, Rob Bennett, Don Summers.
