Python and Machine-Learning for Asset Management with Alternative Data Sets

EDHEC Business School via Coursera

Go to Course: https://www.coursera.org/learn/machine-learning-asset-management-alternative-data

Introduction

**Course Review: Python and Machine-Learning for Asset Management with Alternative Data Sets on Coursera**

In today's rapidly evolving financial landscape, traditional data sources, like market and accounting metrics, are no longer sufficient for securing a competitive edge. With the rise of portfolio crowding and systemic risks, financial institutions are increasingly turning to alternative data to refine their investment strategies. Enter Coursera's "Python and Machine-Learning for Asset Management with Alternative Data Sets," a timely and comprehensive course designed for finance professionals and data enthusiasts alike.

**Course Overview**

The course seeks to unravel the potential of alternative data by introducing the core concepts, current research, and practical applications in asset management. It covers four major modules, each focusing on a distinct aspect of alternative data analytics, with Python as the main programming vehicle for implementation. The course's distinguishing feature is its emphasis on real-world applications, moving beyond theory to provide insights that can be applied directly in the finance sector.

**Syllabus Breakdown**

1. **Consumption Module:** This initial module delves into consumption-based alternative data. Participants will explore various datasets, from geolocation metrics to credit card transaction logs, and learn how to aggregate consumer purchasing behavior to predict company performance before earnings announcements. The blend of theory and hands-on data analytics provides learners with the foundational skills necessary to use consumption data effectively.

2. **Textual Analysis for Financial Applications:** This module serves as an introduction to text mining. Students will learn key techniques such as web scraping, vectorization of text using the bag of words method, and the application of TF-IDF for filtering noise from datasets. By transforming textual information into quantifiable insights, participants will gain market understanding and visualizations that support data-driven financial decision-making.

3. **Processing Corporate Filings:** Here, the course takes a deep dive into analyzing corporate documents such as 10-K and 13-F filings using Python. These documents are often overwhelming for individual analysts due to their scale; this module demystifies them by providing a structured approach to quantitative analysis. Participants will learn automated methods of extracting and analyzing corporate data, equipping them to discern trends and patterns over time.

4. **Using Media-Derived Data:** The final module introduces sentiment analysis and network analysis, focusing on how sentiment can influence market perception and firm performance. Through sentiment analysis of social media and corporate communications, combined with network analysis of corporate interactions, students will use media-derived data to build market insights. The lab sessions encourage active engagement and reinforce learning through practical application.

**Recommendation**

Given the wealth of knowledge presented in "Python and Machine-Learning for Asset Management with Alternative Data Sets," I wholeheartedly recommend this course for anyone looking to enhance their understanding of alternative data in finance. Whether you're a financial analyst, portfolio manager, or data scientist, this course offers skills that can significantly impact the investment decision-making process. The blend of theory, practical lab sessions, and real-world case studies makes this course particularly beneficial. Learners emerge with a deeper insight into market dynamics and with practical skills in Python, data analytics, and financial modeling using alternative datasets.

By the end of the course, participants will be equipped with a robust toolkit for data analysis and will also gain a strategic outlook on what alternative data sets can provide in modern asset management. If you're ready to step up your finance game and embrace the future of asset management, look no further: sign up for this course on Coursera today!

Syllabus

Consumption

The consumption module introduces students to the basics of consumption-based alternative data. By aggregating online and offline consumer purchase activity and behavioral datasets, including geolocation data (e.g., cell locations and satellite imagery), transaction data (e.g., credit card transaction logs and point-of-sale data), and consumer interaction with brands and products on social media, researchers can learn about company performance ahead of official company earnings announcements. Such information may be extremely useful and can provide investment and risk management advantages. This module reviews the theoretical aspects of various consumption datasets and provides practical demonstrations of relevant data analytics.
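As a rough illustration of the aggregation idea described above, the sketch below sums transaction amounts per company and calendar quarter and compares a quarter with the same quarter a year earlier. It is not taken from the course labs: the record layout (company, date, amount), the function names, and the sample figures are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def quarterly_spend(transactions):
    """Sum transaction amounts per (company, year, quarter)."""
    totals = defaultdict(float)
    for company, day, amount in transactions:
        quarter = (day.month - 1) // 3 + 1
        totals[(company, day.year, quarter)] += amount
    return dict(totals)


def year_over_year_growth(totals, company, year, quarter):
    """Growth versus the same quarter one year earlier, or None if data is missing."""
    current = totals.get((company, year, quarter))
    previous = totals.get((company, year - 1, quarter))
    if current is None or not previous:
        return None
    return current / previous - 1.0


# Hypothetical transaction log: (company, transaction date, amount).
transactions = [
    ("ACME", date(2022, 1, 15), 100.0),
    ("ACME", date(2022, 3, 2), 100.0),
    ("ACME", date(2023, 2, 10), 250.0),
    ("BETA", date(2023, 2, 11), 80.0),
]

totals = quarterly_spend(transactions)
acme_growth = year_over_year_growth(totals, "ACME", 2023, 1)
beta_growth = year_over_year_growth(totals, "BETA", 2023, 1)
```

Here `acme_growth` is 0.25 (spend rose from 200 to 250), while `beta_growth` is `None` because there is no prior-year quarter to compare against. A real panel would also need adjustments for changes in the sample of cardholders over time, which this sketch ignores.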

Textual Analysis for Financial Applications

Module 2 is an introduction to text mining as well as a demonstration of how to get from data retrieval (web scraping) to financial market insights. Some of the classic text mining methodologies are covered, such as vectorization of text (the bag of words approach), stop words for filtering, and term frequency-inverse document frequency (TF-IDF). Students will learn how text can be represented mathematically and regularized/filtered to reduce noise. Measures of text similarity will be covered in both theory and practice sessions. Lab sessions walk through examples of scraping data from the web, regularizing it with the described techniques, and finally deriving insights from the textual data.
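The pipeline named above (bag of words, stop-word filtering, TF-IDF weighting, text similarity) can be sketched with the standard library alone. This is a minimal illustration, not the course's lab code: the stop-word list is a tiny made-up sample, the weighting is the plain tf * log(N / df) variant, and the example sentences are invented.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "of", "and", "in", "to", "is"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (word.strip(".,;:!?()\"'").lower() for word in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in c.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = [
    "Revenue growth in the retail segment is strong.",
    "Retail revenue growth is strong and margins improved.",
    "The central bank raised interest rates.",
]
vectors = tfidf_vectors(docs)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

The two retail sentences share several weighted terms, so `sim_related` is positive, while `sim_unrelated` is zero because the third sentence shares no terms with the first once stop words are removed. Note that with this weighting a term appearing in every document gets weight zero, which is the noise-filtering effect TF-IDF is used for.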

Processing Corporate Filings

Module 3 is a practical extension of the text mining lessons to the 10-K and 13-F, two of the most commonly researched corporate filings. This type of data can be extremely daunting for individual analysts due to the sheer size of the documents, but Module 3 describes methodologies for quantitatively analyzing these documents with Python code. Both the 10-K and 13-F documents are worked through, and the lab sessions demonstrate how to pull this kind of data automatically and define metrics around it. In the lab, we investigate implementations of research in this field on the similarity of a given company's 10-K statements over time, as well as the similarity between fund holdings from the 13-F.
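The two metrics mentioned above, similarity of one company's filings over time and similarity between fund holdings, could be sketched as follows. The specific measures used here (Jaccard similarity on word sets and overlap of portfolio weights) are assumptions chosen for simplicity, not necessarily what the course labs implement, and the filing snippets, tickers, and position values are made up.

```python
def jaccard_similarity(text_a, text_b):
    """Share of distinct words common to both texts (1.0 = same vocabulary)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def year_over_year_similarity(filings_by_year):
    """Map each year to its similarity with the previous available year."""
    years = sorted(filings_by_year)
    return {
        year: jaccard_similarity(filings_by_year[prev], filings_by_year[year])
        for prev, year in zip(years, years[1:])
    }


def holdings_overlap(holdings_a, holdings_b):
    """Sum of the smaller portfolio weight over shared positions (0 to 1)."""
    total_a = sum(holdings_a.values())
    total_b = sum(holdings_b.values())
    shared = set(holdings_a) & set(holdings_b)
    return sum(
        min(holdings_a[t] / total_a, holdings_b[t] / total_b) for t in shared
    )


# Hypothetical risk-factor snippets standing in for full 10-K text.
filings = {
    2019: "we face competition from online retailers",
    2020: "we face competition from online retailers",
    2021: "we face supply chain disruption and competition from online retailers",
}
similarity_by_year = year_over_year_similarity(filings)

# Hypothetical 13-F style positions: ticker -> market value held.
fund_a = {"AAA": 600.0, "BBB": 400.0}
fund_b = {"AAA": 500.0, "CCC": 500.0}
overlap = holdings_overlap(fund_a, fund_b)
```

In this toy data the 2020 filing is identical to 2019 (similarity 1.0), the 2021 filing adds new language (similarity 0.6), and the two funds overlap by 0.5 through their shared position. A drop in year-over-year similarity flags a filing whose wording changed and may deserve a closer read.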

Using Media-Derived Data

The final module introduces both sentiment analysis in the context of textual data and network analysis in the context of connectivity between firms. Sentiment analysis is a potentially fruitful source of information that, when done correctly, can reveal what the general population believes about a company (for example, through social media) or whether the company itself is positive or negative about its future outlook (through analysis of tone in corporate filings). Network analysis, as shown in the research of the course instructor and his colleagues, can be used to accurately capture how a financial network is oriented and which companies might perform well because other firms mention them as a threat. The lab session of this module extends the corporate filings analysis to examine sentiment, and also introduces a set of tweets that are then transformed into a network representation.
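A small sketch of both ideas, under stated assumptions: tone is scored by counting words from two short made-up word lists (real work would use a curated financial dictionary), and the tweet network is built as a directed graph from each author to the handles they mention. The handles, tweets, and function names are invented for illustration and do not come from the course labs.

```python
import re
from collections import defaultdict

# Illustrative word lists only (an assumption, not a published dictionary).
POSITIVE = {"growth", "strong", "improved", "gain"}
NEGATIVE = {"loss", "weak", "decline", "risk"}


def tone_score(text):
    """(positive - negative) / (positive + negative), or 0.0 if no hits."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.0
    return (pos - neg) / (pos + neg)


def mention_network(tweets):
    """Directed weighted graph: author -> {mentioned handle: count}."""
    graph = defaultdict(lambda: defaultdict(int))
    for author, text in tweets:
        for handle in re.findall(r"@(\w+)", text):
            graph[author][handle] += 1
    return {author: dict(targets) for author, targets in graph.items()}


def in_degree(graph):
    """Total number of mentions received by each handle."""
    counts = defaultdict(int)
    for targets in graph.values():
        for handle, weight in targets.items():
            counts[handle] += weight
    return dict(counts)


# Hypothetical tweets: (author, text).
tweets = [
    ("analyst1", "Strong growth at @AcmeCorp this quarter"),
    ("analyst2", "@AcmeCorp faces risk from @BetaInc"),
    ("analyst1", "Weak guidance and decline at @AcmeCorp"),
]

scores = [tone_score(text) for _, text in tweets]
graph = mention_network(tweets)
mentions = in_degree(graph)
```

With this data the first tweet scores 1.0 and the other two score -1.0, and `mentions` shows the hypothetical AcmeCorp handle receiving three mentions against one for BetaInc. Counting words this way ignores negation and context, so it should be read as a baseline rather than a finished sentiment model.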

Overview

Over-utilization of market and accounting data over the last few decades has led to portfolio crowding, mediocre performance and systemic risks, incentivizing financial institutions that are looking for an edge to quickly adopt alternative data as a substitute for traditional data. This course introduces the core concepts around alternative data, the most recent research in this area, as well as practical portfolio examples and actual applications. The approach of this course is somewhat unique

Skills

Advanced visualization

Basics of consumption-based alternative data

Text mining methodologies

Web-scripting tools

Reviews

Great lab sessions and very well explained theory. Delivers strong intuition to the student.

Excellent view into modern financial research in the use of alternative data sets including valuable demonstration in implementation.

Different from the other 3 courses but extremely interesting

Good course with great practical content and insights into alternative data sets. I would have liked to see some more involved textual analysis techniques.

Learnt many use cases where machine learning is applied in Finance & Investment domain