As mentioned in my previous post, I spent the last day coaxing ChatGPT (using the GPT-4 model) to help me code an app I can run locally to more rapidly write books in the style of fictional encyclopedia entries.
I suspect a real programmer would have a much easier time getting good code out of ChatGPT, because they’d know how to work efficiently in general. When I do the next one, I’ll hopefully be a lot better at it.
What I found
- Title [single line text input]
- Summary [multi line text input]
- Sections [single line text input, narrow]
- Submit button
- After submit button is pressed:
  - For each item marked in sections, make one:
    - [#] [single line text input]
    - check icon
    - refresh icon
    - trash icon (pressing this icon removes the section item)
  - + [button] (pressing this button adds a new section)
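That add/remove behavior amounts to a little DOM manipulation. Here’s a minimal sketch of what the “add a section” button might do (the function name and markup are my own illustration, not the app’s actual code):

```javascript
// Append one section row (a text input plus a trash button that removes the
// row) to a container element. The structure here is illustrative only.
function addSectionRow(container, doc = document) {
  const row = doc.createElement("div");
  const input = doc.createElement("input");
  input.type = "text";
  const trash = doc.createElement("button");
  trash.textContent = "🗑";
  trash.addEventListener("click", () => row.remove()); // trash icon removes the row
  row.append(input, trash);
  container.appendChild(row);
  return row;
}
```

Wiring the “+” button is then just `addButton.addEventListener("click", () => addSectionRow(sectionsContainer))`.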
Then I gradually added more steps, complexity, and iterations (way too much for a first try, tbh). It was able to keep up, but eventually I hit a bunch of snags. There was quite a long while where I was in a no-man’s land of repeated errors, with ChatGPT bouncing me back and forth on minor changes that didn’t resolve anything. But eventually, after many, many hours, I persevered, ironed out all the main bugs, and succeeded.
Here’s a screenshot of part of the app:
As you can see, the functionality is basically like this:
- Enter a book title
- Enter a book summary
- Pick the number of sections to generate
- Tweak the generated section titles, re-order them, delete them, add new ones, regenerate them, or set a word count for each section
- The section titles form the basis for expanded body text that builds on the topic of the section, the book title, and the summary. You can also click refresh to regenerate a section.
- I don’t show it here, but when you’re done, there is another save button that outputs everything as text with proper headers and paragraphs, so that I can copy-paste it into Vellum easily.
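Under the hood, the section-expansion step is just a prompt assembled from those three fields and sent to OpenAI’s chat completions endpoint. A minimal sketch, assuming `gpt-3.5-turbo` (the prompt wording and function names are my own illustration, not the app’s actual code):

```javascript
// Build the prompt for one section from the book title, summary, and section
// title. The exact wording here is illustrative.
function buildSectionPrompt(title, summary, sectionTitle, wordCount = 200) {
  return `Book title: ${title}\n` +
         `Summary: ${summary}\n` +
         `Write roughly ${wordCount} words of body text for the section ` +
         `"${sectionTitle}", in the style of a fictional encyclopedia entry.`;
}

// Send the prompt to the OpenAI chat completions endpoint from the browser.
async function generateSection(apiKey, title, summary, sectionTitle) {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [
        { role: "user", content: buildSectionPrompt(title, summary, sectionTitle) },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```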
And that’s basically it. Though I will say that for a beginner app-building experience with ChatGPT, it was plenty frickin’ hard to pull off.
Process & Issues
I used Dreamweaver as my coding environment, since I have an Adobe Creative Cloud subscription already. I’m not sure if I like it. I thought it would be better. I might just use Sublime Text next time.
I previewed my HTML in Firefox, cause that’s my main browser.
You will need to open the console in your browser so you can see errors. You can also ask ChatGPT to add console logging, so you can see other values being passed to or returned from the API.
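A small pass-through helper makes that kind of logging painless to sprinkle into existing code. This is a hypothetical helper, not from the app:

```javascript
// Log a labeled value to the browser console and return it unchanged, so it
// can be dropped into the middle of an expression without restructuring code.
function logged(label, value) {
  console.log(label, value);
  return value;
}

// Example: wrap an API response right where it's consumed.
// const data = logged("API response:", await res.json());
```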
When you get an error, you just copy paste it into ChatGPT.
It helps to frequently copy-paste your latest code into ChatGPT so it knows what you’re up to. GPT-4 has a much bigger context window than GPT-3.5, which is one major advantage, plus the quality of completions is much better, imo.
Expect to get a lot of run-around from ChatGPT while you’re squashing bugs. It’s not omniscient, and even when it says something is going to fix the error, it usually takes 10 or 20 tries before that’s actually true. It bumbles around much more like a person does than I thought it would.
Sometimes it will tell you wrong things about your code. It will say x is or isn’t in your code, and that you should replace it with z, but when you check, there is no x anywhere to be found. That’s why frequently pasting in your latest code version is very helpful.
In fact all the copy pasting is the biggest time waster of the whole thing.
The most obvious solution would be to integrate ChatGPT right into the development environment. At first, I thought this must be what Github Copilot does, since it’s backed by OpenAI tech.
But that’s not really how Copilot works, apparently (correct me if I’m wrong, I didn’t try it). It seems to be more like autocomplete for code, so you have to actually know something about coding to get started. I’m also not sure whether it can see your console and interpret errors. I don’t think you can just use plain language to communicate with it in a chat format.
I also don’t think it would let you go into your UI in the browser (WYSIWYG style), highlight something, and say “make it do this ___”. Obviously you can’t quite do that in ChatGPT either (yet), since it can’t see your browser, but you can at least tell it that in chat. I think you can sorta kinda do that with commented text in Copilot? But I probably won’t explore it right now. Too many side-tracks to get diverted by…
So yeah, what’s needed is a tighter integration with the dev environment, the browser (including the console & UI itself). There’s an opportunity to build a killer product in that space that combines those, for sure.
Initial Output Results
Getting the longer body text fields to actually relate meaningfully to my book title, summary, and section titles is oddly difficult. I believe I’m hitting davinci-codex, which the system led me to believe is equivalent to gpt-3.5-turbo, but I’m honestly not entirely sure.
After a good amount of tinkering, here’s one example text excerpt it generated:
> (The chickens have started a space program.) (They are not as smart as we are, but they are smarter than we were a generation ago.) (They are doing well in the space race.) (They have discovered a new form of energy.) (They have published a book on exotic chickens.) (They have made a movie about exotic chickens.) …
I can’t figure out yet how to keep it from coming out all weird and junky like this, but I’ll read around more on it. I’m not that concerned, though; I believe that when they start granting API access to GPT-4, the quality of text output will probably improve too, since there has been a noticeable improvement between 3.5 and 4 in text completion quality.
According to my OpenAI API billing page, I’ve only so far spent $0.29 while building and testing this app. I thought it was going to be significantly higher.
Based on my experiments so far, I’d estimate that once I’ve worked out the kinks in getting the text output quality up to snuff, producing one ebook of around 2-2.5K words might end up costing something like $0.05. I’m not sure yet, since it depends on how many section titles and body text entries you generate or regenerate during a session for a single book. But it’s certainly well below ten cents.
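The back-of-envelope math supports that. A rough sketch, assuming gpt-3.5-turbo’s advertised $0.002 per 1K tokens and roughly 1.3 tokens per English word (both assumptions, and the regeneration count is a guess):

```javascript
// Rough cost estimate for one book: total tokens across the initial pass plus
// some number of full regeneration passes, times the per-1K-token price.
function estimateBookCost(words, regenerations = 3, pricePer1kTokens = 0.002) {
  const tokensPerWord = 1.3; // rough rule of thumb for English text
  const tokens = words * tokensPerWord * (1 + regenerations);
  return (tokens / 1000) * pricePer1kTokens;
}

console.log(estimateBookCost(2500).toFixed(3)); // ~$0.026 even with 3 full redos
```

Even with several wholesale regenerations of a 2.5K-word book, that stays comfortably under ten cents.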
Time Per Book
“Writers” with a capital W probably aren’t going to like this, but again, once the quality of output is improved, I would guess (based on what I’m seeing in ChatGPT Plus) that this would shrink the time to generate a 2.5K-word book down to something like 5-10 minutes for the full text generation, if you’re basically just accepting whatever it comes up with for your content. More thoughtful editing might take 15-30 minutes if you’re quick.
Another thing I did while writing my last two ebook editions with GPT-4 was ask the model for image theme suggestions. They were generally pretty good, and helped simplify the generation process I usually do in PlaygroundAI.com. It’s still tedious and time-consuming to do that when you’re doing a new book every few days, though. I usually start with about 150 generated images I download, and then cull that down to about 80 of the best for the particular volume.
What I’d also like is to have my app suggest image themes based on the fully generated text. Then I could use those right within the app to hit a third-party API for either Dall-e or Stable Diffusion (preferably both), and have it not only generate a bunch of images (and let me pick the best), but also distribute them within the text. This way I could copy-paste it all into Vellum in quick order. Otherwise, arranging them within chapters in Vellum takes a lot of time as well.
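For the Dall-e half of that idea, OpenAI already exposes an image generation endpoint, so the hook might look something like this (the function name and wiring are my own illustration; the endpoint and body shape follow OpenAI’s image API):

```javascript
// Hypothetical image step: ask Dall-e for n images for one suggested theme
// and return the image URLs for the user to curate.
async function generateImages(apiKey, theme, n = 4) {
  const res = await fetch("https://api.openai.com/v1/images/generations", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ prompt: theme, n, size: "512x512" }),
  });
  const data = await res.json();
  return data.data.map((img) => img.url);
}
```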
Ideally, if it gets working the way I want, I could create and curate images into the text as rapidly as I generate the text. But let’s say for argument’s sake, I could create a full book in maybe an hour or a little more? I’d still need to do some image work for the cover & previews in Photoshop, and then post to Gumroad. But I think this is a pretty achievable goal with another day or two of hacking away at my code.
I thought going into this I was going to be more “reliant” on ChatGPT. Certainly, I was. But also, since the model is wrong so often, and I had to do so much careful checking, I ended up feeling instead like it was accelerating my learning. Instead of reducing my agency, it increased my agency. And now I’m feeling like I’m fully equipped to go out and tackle tons of other problems I would never have been able to do myself in a million years otherwise. All told, major win.
This experience also has radically turned down the volume on just about all the other complainerism & trolling people are doing with ChatGPT. Like, yeah, it’s gonna spit out wrong facts. In a way, who really cares (though I have thoughts on “truth” and APIs I’ll write another time)? To me that’s a far less interesting problem than what I can actively build with this, now that these coding skills are within my grasp that never were before. And this shit is only just getting started!