Livebook Youtube Summarizer

nikiljos · February 25, 2024, 9:36am

Hello,
I recently attempted to revise for an Operating System CA, but I found myself overwhelmed by the sheer volume of YouTube videos available in the livebooks. Rewatching them all before the test seemed impossible. Seeking assistance, I turned to ChatGPT, but the task of downloading transcripts for all the videos and providing them to ChatGPT was daunting due to their large number.

This led me to consider automating the process. I experimented with writing a small script that takes the livebook module names, queries the Livebook APIs to identify the relevant YouTube videos, retrieves the subtitles for those videos using their IDs, and then passes the transcripts to the OpenAI API to summarize them and generate multiple-choice questions (MCQs). The script works seamlessly.
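For readers who want to picture the flow, the steps above can be sketched roughly as follows. This is not the code from the repo: `find_video_ids`, `fetch_transcript`, and `complete` are hypothetical callables standing in for the Livebook API lookup, the subtitle download, and the OpenAI call, and the prompt wording is made up for illustration.

```python
from typing import Callable, Dict, Iterable


def build_prompt(transcript: str, num_questions: int = 5) -> str:
    """Build one prompt asking for a summary plus MCQs (wording is illustrative)."""
    return (
        "Summarize the following lecture transcript, then write "
        f"{num_questions} multiple-choice questions with answers.\n\n"
        f"Transcript:\n{transcript}"
    )


def summarize_module(
    module_name: str,
    find_video_ids: Callable[[str], Iterable[str]],  # hypothetical: module name -> video IDs
    fetch_transcript: Callable[[str], str],          # hypothetical: video ID -> subtitle text
    complete: Callable[[str], str],                  # hypothetical: prompt -> model output
) -> Dict[str, str]:
    """Return a mapping of video ID to generated summary + MCQs for one module."""
    results: Dict[str, str] = {}
    for video_id in find_video_ids(module_name):
        transcript = fetch_transcript(video_id)
        if not transcript.strip():
            # Skip videos that have no usable subtitles.
            continue
        results[video_id] = complete(build_prompt(transcript))
    return results
```

Passing the three steps in as functions keeps the sketch independent of any particular HTTP client or SDK, and makes it easy to try out with stubs before spending API credits.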

Also, OpenAI recently reduced the cost of its gpt-3.5-turbo model with the introduction of gpt-3.5-turbo-0125, so this turned out to be really affordable as well.

Check out the code and results here.
Please do drop a star on my repo as well if you liked it.

anilgulecha · February 26, 2024, 5:30am

This is cool, @nikiljos.

How long does it take to run for 1 LU?

2 key enhancements:

1 - run for all

You can run this for every livebook, and the output can be rendered into a folder structure: Course/Module-01-unit-01-name-of-learningunit.md …

Since you’ve looked at the JSON of the livebooks API, it should be straightforward to parse the course/module/LU/lesson structure.
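As a rough illustration of the naming scheme suggested above, here is a minimal sketch that builds the output path for one learning unit. The function names and the idea that module and unit numbers are available as integers are assumptions; the actual fields in the livebooks JSON may differ.

```python
import re
from pathlib import PurePosixPath


def slugify(text: str) -> str:
    """Lowercase the text and collapse anything non-alphanumeric into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def summary_path(course: str, module_no: int, unit_no: int, unit_name: str) -> PurePosixPath:
    """Build Course/Module-NN-unit-NN-name.md (hypothetical helper, not from the repo)."""
    filename = f"Module-{module_no:02d}-unit-{unit_no:02d}-{slugify(unit_name)}.md"
    return PurePosixPath(course) / filename
```

For example, `summary_path("Course", 1, 1, "Name of LearningUnit")` gives `Course/Module-01-unit-01-name-of-learningunit.md`. Zero-padding the numbers keeps the files sorted in course order in a plain directory listing.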

After that, this can be run once for all of the LUs, summarizer included, and the results checked into the repo. Should run for an hour and

If you’re able to get the above working, I’ll sponsor access to the GPT-4 API, so we can run it with that and get higher-quality summaries.

2 - build a PDF/epub.

There are tools such as GitBook that build a PDF/EPUB book from a set of .md files.
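Tools of this kind generally need a table of contents listing the Markdown files in reading order. A minimal sketch of generating one is below, assuming a `SUMMARY.md`-style layout (a Markdown bullet list of links, as used by the legacy GitBook toolchain) and file paths without spaces; `build_toc` is a hypothetical helper, and the exact format a given tool expects should be checked against its documentation.

```python
from pathlib import PurePosixPath
from typing import List


def build_toc(md_paths: List[str], title: str = "Summary") -> str:
    """Return a Markdown table of contents with one section per folder."""
    lines = [f"# {title}", ""]
    current_folder = None
    # Sorting relies on zero-padded names such as Module-01-unit-02-... for course order.
    for path in sorted(md_paths):
        p = PurePosixPath(path)
        folder = str(p.parent)
        if folder != current_folder:
            if current_folder is not None:
                lines.append("")
            lines.extend([f"## {folder}", ""])
            current_folder = folder
        # Use the file name (without extension) as a readable link label.
        label = p.stem.replace("-", " ")
        lines.append(f"* [{label}]({path})")
    return "\n".join(lines) + "\n"
```

The returned string can be written to a file at the root of the folder of summaries before running the book builder.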

anilgulecha · February 26, 2024, 5:30am

In the spirit of FOSS development, I’ll also open GitHub issues for both of the above.