The First Chapter of Content Creation with GenAI
Recently, I have been thinking a lot about the essence of content creation and the impact of GenAI on it. This long post sums up my thoughts; I hope you enjoy it.
So far, 2025 has seen a few viral moments in GenAI content creation. In March, the "Studio Ghibli" style image generation from OpenAI swept the internet. In late May, Google publicly released Veo 3 for short video generation, which quickly went viral. What makes Veo 3 stand out is not just its better instruction following and video quality, but its ability to generate audio - voice, music, etc. - that goes smoothly with the video, something previous models could not do.
Out of curiosity and excitement, I created this music video called "we love python" by stitching together four videos created with Veo 3:
These videos were generated with very simple prompts:
"A programmer singing a silly song about python while playing guitar in his bedroom."
"A large group of student programmers from all over the world singing a silly song about python on the stage while guitars & drums are playing."
For a minute or two, I felt quite proud of what I created. I even uploaded it to X and quickly got one repost and one heart. I consider it a huge personal success because, with my mere 9 zombie followers, my occasional posts in the past never got any engagement.
But I quickly realized how cheap my "creation" was. There were no unique ideas or narratives in my prompts. I didn’t even write the lyrics for the song snippets. My creation has little to do with myself. In fact, I should probably call the video I generated "DIY consumption" instead of creation. In a near future world where everyone has access to video generation tools, would my video have a chance to get one heart or repost? I highly doubt that.
The trend of Studio Ghibli style images quickly died out. One-prompt vibe-coding demos are only good for a one-time showoff of a new LLM’s capabilities. Once the novelty effect fades away, any cheap creation, no matter what fancy tools are used, has no better than a random chance of attracting short-term attention, and it surely doesn’t create enduring value.
The question then is, what value does GenAI provide to creation, if any? To answer that question, we first need to understand the essence of creation.
Creation is about Self-Expression and Fulfillment
When I travel to different cities, I love to pause by the roadside or on the overpasses to watch and listen to street artists' performances. Most of the time, the drawings they create aren't top-tier, their guitar playing isn't particularly outstanding, and their voices are worlds apart from the sound quality in CDs. However, I enjoy watching their fingertips skillfully glide across the canvas or strings, and I like gazing at the expressions on their faces as they immerse themselves in their works. They use drawings, guitars, and songs to tell their own stories. Their self-expression makes me feel the existence of a unique and interesting soul nearby, which is deeply satisfying.
We don’t always need to watch the process of a work of art being created in order to feel the soul. One can feel the soul behind all the great works by just consuming the works themselves, even though a backstory always makes the work more fascinating. Through self-expression, creators show their creativity, lay bare their experiences and perspectives in front of their audience, defying the pull of mediocrity.
Creation needs to appeal to an audience in order to be sustainable, but creators must also enjoy the process of creation, which, at the highest level, gives them a sense of fulfillment. This process of cultivating fulfillment is like solving a puzzle. You know what to expect when the puzzle is solved and you are thrilled by the goal itself. You make the effort to search for missing pieces and try different pieces out, and you can see a connection between your efforts and your progress. Finally, all the pieces come together, and the joy that comes with it is the sense of fulfillment.
Where GenAI Will Change Creation
GenAI won’t help you much in coming up with unique, in-depth stories because fundamentally, they come from who you are, what you have experienced in your life, how much passion you have for creation and how much effort you have invested. However, uniqueness and depth are not sufficient for successful creation; you also need to master the technique of self-expression, and you need the time and money to make it happen.
And that’s where GenAI tools can help. Even if they are unlikely to be a shortcut to true mastery of technique, if you have a great story to tell, an “average” level of technique might just be enough to make it successful. Anyway, history is full of great works of literature which succeeded not because of the technique, but because of the uniqueness or depth of the expression.
But if the benefit of GenAI is only to reduce the need for learning techniques, or to make those who already have the techniques more efficient, the impact on creation would be pretty limited. Lots of forms of creation today - digital images, writing, etc. - are already very low cost and highly democratized. There is little room to lower the barrier further in order to attract more talent, which leaves little room for improving the supply of quality work. In fact, in areas where creation is already highly democratized, GenAI is more likely to create a race to the bottom by flooding the market with ever cheaper and lower quality supply.
The real potential of GenAI, then, would be to democratize those art forms that are currently way too expensive for individuals or small businesses to create, or art forms that barely exist today because they are too expensive to be profitable. The filmmaking industry is a great example. The high cost of hiring actors and crew, scouting locations and creating visual effects makes it a highly centralized industry. Lots of great stories, long and short, couldn't be filmed because of the cost and restricted access to resources.
Democratization is a big deal. The 15th century reinvention of movable type printing in Europe, combined with low cost production of paper and ink, greatly democratized access to knowledge, literacy and the publication of opinions, which helped fuel the Renaissance and the Scientific Revolution. The invention of the internet further made knowledge access and publication almost free for every individual. Right now, I am able to write this piece for anyone with access to the internet, precisely because of the magic of democratization.
Democratization of filmmaking, or more generally, of high quality visual storytelling, seems inevitable. It will come with pains that are hard to overlook - scams, harassment, etc. - but the same happened with the democratization of writing. There are always some benefits to having voices come only from the official or the established, but restricting access to a few comes with a much bigger downside.
To some degree, we have seen this democratization happening. On X, there are lots of viral videos built with Veo 3. Lots of them are just novelty effects, but some of them are intrinsically good. My favourite is this video about the lives of AI characters - a touching, deep and “meta” story. In another post, someone shared a made-up commercial, and claimed that similar commercials they shot in the past cost 500K dollars. I don’t know if that’s real, but I admit that the made-up commercial is pretty appealing.
Is Filmmaking with GenAI Ready for Prime Time?
If expensive forms of creation like filmmaking are poised to be democratized and we have seen successful examples with GenAI, does it mean that creation with GenAI is ready for prime time? To answer this question, we will have to go back to the essence of creation, which is self expression and fulfillment.
GenAI is a tool for creation, like a paint brush for painting. For it to become a primary tool, it has to be expressive, and it has to deliver fulfillment to the people using it. That boils down to two things - steerability and predictability, which I will examine in the rest of this section.
The breakthrough of high quality video generation amazed lots of people, to the extent that some ML researchers think video generation is somehow easier than text generation and is progressing faster. However, it is worth noting that video is a much more engaging form of expression, and because of that, any progress on it has a much larger psychological effect. In terms of steerability, video generation is still far behind text generation.

The lack of steerability is easy to test, and here is just one example. It is a very simple task for today’s LLMs to generate text with a step by step demonstration of calculating 12 * 12, so let’s see how well a video generation model can do it.
This was Sora’s work:
This was Veo3’s work:
I tried both simply prompting the model for a step by step demonstration and directly giving it the steps to follow, but the results in both cases were equally funny.
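For comparison, here is roughly what the text version of that demonstration looks like. This is only a sketch: the function name is made up for illustration, and the partial-products breakdown is an assumption, not necessarily the exact steps given to the video models.

```python
def multiply_step_by_step(a: int, b: int) -> list[str]:
    """Return the partial-products steps for a * b.

    Hypothetical helper for illustration only: it splits b into
    its tens and ones, so it is meant for small two-digit numbers.
    """
    tens, ones = divmod(b, 10)
    tens_part = a * tens * 10   # e.g. 12 x 10
    ones_part = a * ones        # e.g. 12 x 2
    return [
        f"{a} x {b} = {a} x ({tens * 10} + {ones})",
        f"{a} x {tens * 10} = {tens_part}",
        f"{a} x {ones} = {ones_part}",
        f"{tens_part} + {ones_part} = {tens_part + ones_part}",
    ]


for step in multiply_step_by_step(12, 12):
    print(step)
```

Four short lines of text are enough to get from 12 x 12 to 144, which is the kind of content the video models were asked to show on screen.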
If video GenAI tools are not quite steerable (yet), in which cases will they be steerable? What kinds of prompts will they handle well? If a result is poor, what details in my prompt should I drop or add to improve it? That’s the predictability question. While predictability also depends on a user’s experience with the tool, the fact is that GenAI tools are never quite predictable.
Unpredictable rewards have deep psychological implications (the reward in this case is a good shot). Century-old research tells us that they create surprise and anticipation, leading to repeated engagement (clicking the retry button again and again) that’s almost irresistible. At the same time, unpredictability deprives creators of the sense of fulfillment, which comes from seeing the connection between their efforts and their success. It is a lose-lose in the long term. Generating multiple versions at the same time reduces the unpredictability somewhat, but it is not very effective.
Just like using GenAI for vibe coding, my advice for people starting to use GenAI for visual story-telling is not to try too hard on tuning their prompt, because there isn’t a strong correlation between trying hard and getting a good result. Always leverage all the tools available and focus on what you have control over.
The Future of Creation with GenAI
Visual storytelling with GenAI has unlocked opportunities for people who have the talent but couldn’t afford the production costs in the past. The use cases are still quite limited, and there will always be cases where real physics is much cheaper and better, but as we have seen over the past two years, the fundamental performance of the models and the tool integration are progressing very fast. It is hard to predict what use cases will be unblocked in the next 6 months.
We are at the first chapter of content creation with GenAI. What would the next chapter be like? I don’t know, but I look forward to seeing it soon.