Build Text Mining Applications With Live-Coding In Python

MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz
Language: English | Size: 3.7 GB | Duration: 5 hours 34 minutes

Master text mining skills with a highly engaging approach and learn to build data products!

What you’ll learn
Learn how Text Mining was used to decide on the format and content of this course
Introduction to Text Mining and its applications with a hands-on approach
Build Text Mining skills by implementing various algorithms in Python
Pick up programming skills through live-coding to go from ideas to a working implementation
Build a Search Engine and Text Summarization tool in a guided format
Learn a blueprint for developing, building, and deploying text mining applications

Requirements
Basic knowledge of Python or willingness to pick it up
Experience with, or willingness to set up, a Python development environment on Linux, macOS, or Windows
Willingness to learn new skills by practicing and following through live-coding sessions
You will need a computer with an internet connection and dedicated time to work on the course

Description
Welcome! This is a Text Mining course carefully crafted using Text Mining techniques. Let me elaborate. When I decided to teach a Text Mining course, I wondered about student expectations and their pain points with current courses. What data source can provide this information? Reviews! I started using course review data to answer questions about course content, student expectations, likes and dislikes, and the pain points students face in completing online Text Mining courses. This exercise was so valuable to my understanding of students like you that I decided to include it in the course. More on this in the course :)

This is a “skill first, knowledge later” course. We will do a lot of hands-on coding together (you and I) and minimize the use of PowerPoint slides. I will use slides only to show the course outline and our progress through the course. I take personal responsibility for ensuring that you gain the required knowledge and, most importantly, master the skills you need to start building and deploying text mining applications. I truly believe that this “skill first” approach will be highly engaging for you!

This is not a traditional style of teaching. The course is based on live-coding sessions that convey the fundamental ideas of text mining. I will derive every concept by hand and show its working using Python programs implemented during the course. You can implement these ideas along with me and thereby gain a deeper sense of text mining, empowering you to build your own products. You will build a search engine and a text summarization tool from scratch (with some support where it makes sense: for example, stopwords are already available from the NLTK library, so we need not reinvent them). This level of depth can be achieved only through sacrifices 🙂 Don’t worry, you don’t have to sacrifice your weekends yet! The sacrifice is only that I will not cover popular libraries for processing text in this course.

How does this course impart the skills you need?
I strongly believe that projects and practice are the only way to master any skill, and yet practice is underutilized in teaching. This course has minimal PowerPoint presentations and focuses entirely on practice right from the beginning instead of waiting for assignments and projects at the end (hence, no assignments in this course). This is the only course I know of that is crafted using text mining techniques: a real-world example of the power of text mining to directly address the preferences of students taking text mining courses.

What will you learn in this course?
Introduction: You will get a general introduction to the course structure and teaching style.
Unstructured Data: You will see motivating examples of the power of unstructured data and learn about the challenges in processing it.
Python Programming Primer: You will learn the basic programming constructs you need to follow along with the course. You can use this section to understand the basics and prepare yourself to learn advanced Python for writing production-quality code.
Text Mining Basics: You will learn the basics of text processing, document representation using the vector space model, and ranking documents for a given query. You will learn to implement these algorithms in Python.
Build a Search Engine: You will build your own search engine using the implementations from the previous section. Your search engine will be wrapped as a data service for potential deployment as a product. You will also have the option of adding a user search interface to your search engine!
Deploy your Text Mining Application: You will go from a student skilled in text mining to a professional able to build real-world applications and services with the text mining skills you have picked up in this course.
Build a Text Summarization Tool: You will learn basic text summarization techniques that are crucial for exploring large document collections, and implement code to create a tag cloud in Python. You will also use state-of-the-art NLP work on embeddings to cluster custom course review data.

Who should avoid taking this course?
I truly value your time and want to be upfront about the course offering.
Students expecting a knowledge-first approach may not find this course valuable: I will not present a comprehensive, broad view of text mining, but will instead dig deeper into its basics.
Students who prefer not to code and build systems: in almost every video in this course, after explaining the key ideas, we will write code together to internalize text mining ideas.
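
To give a feel for what the Text Mining Basics section describes (vector space representation and ranking documents for a query), here is a minimal from-scratch sketch using only the Python standard library. It is not the course's own code: the toy corpus `DOCS`, the function names, and the particular weighting (raw term count times `log(N / df)`, scored by cosine similarity) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus (an assumption for illustration only).
DOCS = {
    "d1": "a gripping thriller with a clever plot",
    "d2": "a slow romance with a predictable plot",
    "d3": "clever dialogue and a gripping finale",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(docs):
    """Compute idf(t) = log(N / df(t)) for every term in the corpus."""
    n_docs = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))  # count each term once per document
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf_vector(text, idf):
    """Map text to a sparse {term: tf * idf} vector; unseen terms are dropped."""
    tf = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    idf = build_idf(docs)
    query_vec = tfidf_vector(query, idf)
    scores = {
        doc_id: cosine(query_vec, tfidf_vector(text, idf))
        for doc_id, text in docs.items()
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in rank("gripping clever", DOCS):
        print(doc_id, round(score, 3))
```

In this sketch a term that appears in every document (here, "a") gets an idf of zero and so contributes nothing to the ranking, which is the intuition behind idf weighting.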

Overview
Section 1: Introduction

Lecture 1 Introduction

Lecture 2 Outcomes and Exclusions

Section 2: Unstructured Data

Lecture 3 Motivation

Lecture 4 Information Need

Section 3: Python Programming Primer

Lecture 5 Development Environment Setup

Lecture 6 Input/Output Handling

Lecture 7 Data Structures: Lists

Lecture 8 Data Structures: Dictionaries

Lecture 9 Data Structures: Dataframes

Lecture 10 Data Structures: Dataframe Operations

Lecture 11 Control Structures

Lecture 12 Functions and Classes

Lecture 13 Practical Tips for Code Organization

Section 4: Text Mining Basics

Lecture 14 Movie Review Dataset

Lecture 15 An Example Information Need

Lecture 16 Search by Linear Scan

Lecture 17 Idea of Indexing

Lecture 18 Boolean Retrieval: Introduction

Lecture 19 Tokenization

Lecture 20 Stop Word Removal

Lecture 21 Stemming and Lemmatization

Lecture 22 Boolean Retrieval: Implementation

Lecture 23 Postings List

Lecture 24 Boolean Retrieval using Postings List

Lecture 25 Boolean Retrieval: Limitations

Lecture 26 Ranked Retrieval

Lecture 27 Precision and Recall

Lecture 28 Boolean Retrieval Performance Measure

Lecture 29 Term Frequency (tf)

Lecture 30 Inverse Document Frequency (idf)

Lecture 31 Scaling term weights with TF-IDF

Lecture 32 Vector Space Model

Lecture 33 Rank Documents for a Query

Lecture 34 Evaluating Ranked Retrieval

Section 5: Build a Search Engine

Lecture 35 Architect a Search Engine

Lecture 36 Search Engine as a Flask Application

Lecture 37 Ranking Documents for a Query

Lecture 38 Launch your Search Engine

Section 6: Deploy your Text Mining Application

Lecture 39 Why Deploy?

Lecture 40 Technologies for Deployment

Lecture 41 Containerization using Docker Compose

Lecture 42 Deploy using Mogenius

Section 7: Text Summarization using Embeddings

Lecture 43 Why Summarize Text?

Lecture 44 Course Review Dataset & Word Cloud

Lecture 45 Embeddings

Lecture 46 Cluster Text using Embeddings

Lecture 47 Generate Cluster Summaries

Section 8: Conclusion

Lecture 48 Congratulations on Completion!
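
As a rough illustration of the topics named in Lectures 23 to 28 (postings lists, Boolean retrieval, precision and recall), the sketch below builds a small inverted index and answers AND queries by merging sorted postings lists. It is an assumption-laden example, not the course's implementation: the corpus `DOCS`, the function names, and the choice to skip stop word removal and stemming are all made up for brevity.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus keyed by integer document id (illustration only).
DOCS = {
    1: "the hero saves the city",
    2: "the villain destroys the city",
    3: "a hero and a villain share a secret",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Map each term to its postings list: a sorted list of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping ids present in both."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    # Start with the shortest postings lists to keep intermediate results small.
    postings = sorted((index.get(term, []) for term in terms), key=len)
    result = postings[0]
    for other in postings[1:]:
        result = intersect(result, other)
    return result


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    index = build_index(DOCS)
    print(boolean_and(index, "hero villain"))
    print(precision_recall(retrieved=[1, 3], relevant=[3]))
```

Because Boolean retrieval returns an unordered set of matches, precision and recall are computed over that whole set; ranked retrieval needs rank-aware measures instead.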

Who this course is for
Anyone who wants to leverage vast unstructured data to build their own products and services
