This lesson picks up where the previous notebook left off and expands on a few pretty cool concepts. It also walks through two recent papers: one on distillation, which was pretty cool for doing diffusion in fewer steps, and another on image editing. I didn't realize how many paper walkthrough videos there were on YouTube, and while I certainly want to get better at reading papers myself, it's great to know there's a ton of material online to consult.
The rest of the video walks through various internals of the pipeline. It's neat to actually see how much noise is being scheduled at different points. Perhaps most interesting was getting to work with tokenizers again, which we did in the first half of the course. There were some fun experiments here, particularly around replacing a token with an embedding that mixes two concepts. I got some really interesting results, e.g. a mix between a keyboard and a typewriter:
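The core of that experiment is just vector arithmetic on the token embeddings. Here's a minimal sketch with random stand-in vectors — in the real notebook the embeddings come from the pipeline's CLIP text encoder (via `text_encoder.get_input_embeddings()`), and the blended vector is written back over the prompt token before encoding:

```python
import numpy as np

# Random stand-ins for the CLIP token embeddings of "keyboard" and "typewriter";
# the real ones are rows of the text encoder's embedding matrix (dim 768 for SD 1.x).
rng = np.random.default_rng(0)
emb_keyboard = rng.normal(size=768)
emb_typewriter = rng.normal(size=768)

# Overwrite the prompt token's embedding with a 50/50 blend of both concepts,
# so the model "sees" a single token that is half keyboard, half typewriter.
mixed = 0.5 * emb_keyboard + 0.5 * emb_typewriter
```

Varying the mixing weights (say 0.7/0.3) slides the result toward one concept or the other.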
I also liked the image guidance material about using different prompts with the same general shape of image, although my results were less impressive. You can see my full results in the notebook here.
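The trick behind keeping the same general shape is to start not from pure noise but from a partially noised version of the source image's latents, then denoise with the new prompt. A toy sketch of that starting point — note this uses a simple linear blend as a stand-in; the real pipeline noises with the scheduler's `add_noise` at an intermediate timestep:

```python
import numpy as np

def noisy_start(src_latents, noise, strength):
    # Image-guided start: blend the source image's latents with noise.
    # strength=0 keeps the source exactly; strength=1 is pure noise
    # (i.e. ordinary text-to-image). Denoising then runs from this point,
    # so large-scale structure from the source survives.
    return (1 - strength) * src_latents + strength * noise

rng = np.random.default_rng(0)
src = rng.normal(size=(1, 4, 64, 64))    # stand-in for VAE-encoded latents
noise = rng.normal(size=(1, 4, 64, 64))
start = noisy_start(src, noise, strength=0.6)
```

Lower strength values preserve more of the original composition, which matches what the notebook shows.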
I'm increasingly feeling like I have real control over what's going on, especially as we start to play with more of the textual inversion stuff and manipulate embeddings directly.
I generally just followed along with the notebook, since I didn't feel there was much to modify. Perhaps the coolest part was seeing how you could define your own loss function and plug it into the diffusion process, e.g. strongly encouraging the image to be green based on its pixel values. Here was one of my results:
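A minimal sketch of such a loss, assuming decoded pixels in [0, 1] with channels in RGB order — in the diffusion loop you'd take the gradient of this with respect to the latents and nudge them at each step:

```python
import numpy as np

def green_loss(images, target=0.9):
    # images: (batch, 3, H, W) decoded pixels in [0, 1].
    # Penalize the green channel's distance from a high target value,
    # so gradient steps on the latents push the image toward green.
    return np.mean((images[:, 1] - target) ** 2)

# An all-black image is far from green; a green image scores near zero.
imgs = np.zeros((1, 3, 8, 8))
loss_black = green_loss(imgs)
imgs[:, 1] = 0.9
loss_green = green_loss(imgs)
```

Any differentiable function of the pixels works the same way, which is what makes this technique so flexible.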
One of Jeremy's instructions, to prove you really understood it, was to implement negative prompting. I did that here:
TKTKTKTK
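For reference, the core trick is a small change to classifier-free guidance: swap the unconditional ("") embedding for the negative prompt's embedding, so each step is pushed away from the negative concept and toward the positive one. A sketch with toy predicted-noise tensors standing in for the two UNet calls:

```python
import numpy as np

def guided_noise(noise_neg, noise_cond, guidance_scale=7.5):
    # Classifier-free guidance with the negative prompt playing the role
    # of the unconditional branch: extrapolate from the negative-prompt
    # prediction toward the positive-prompt prediction.
    return noise_neg + guidance_scale * (noise_cond - noise_neg)

# Toy tensors in the shape of SD latents-space noise predictions.
noise_neg = np.zeros((1, 4, 8, 8))   # UNet output given the negative prompt
noise_cond = np.ones((1, 4, 8, 8))   # UNet output given the positive prompt
pred = guided_noise(noise_neg, noise_cond)
```

With an empty negative prompt this reduces to ordinary classifier-free guidance, which is a nice sanity check.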
I also felt like this was a good time to read (a) the Lilian Weng diffusion post and (b) the original Denoising Diffusion Probabilistic Models paper.
TKTKTKTK
One of the principles of the second half of the course is that we will derive a whole diffusion model starting from just the standard Python library. Jeremy starts working through this notebook - I followed along in my own TKTKTKTKTK