From the outside looking in, the relationship between creatives and AI seems fraught.
There’s all this conversation about who AI might replace and when, but tools like Invoke seem to be putting the power back in the hands of designers.
It’s an AI-powered, pro-grade image generation tool built to streamline designers' creative processes.
Similar to Photoshop, it comes with hundreds of features, and its use spans industries like gaming, TV, retail, architecture, and product design.
I spoke with Invoke’s CEO and founder, Kent Keirsey, and he shared:
This product comes at a time where people are trying to figure out how to leverage AI for their creative endeavors. Having a background in product management, what inspired you to build Invoke?
I've always been a bit of a nerd and a creative spirit. My dad was an architect. My granddad was an architect. I grew up around CAD tools and design, and had a significant respect for creative endeavors. I played around with Photoshop and Macromedia and all the tools growing up. I’m the product manager that jumps into a Figma file and makes some edits in a pinch when I need to.
In August 2022 (this is before ChatGPT), there was an open source model that was released called Stable Diffusion. It was the first open model that really made a big impact on the world. And I stumbled upon it.
I've got a computer that can run this stuff locally, so I looked online and found a small open-source repository for a command-line terminal interface that had been started by the head of Adaptive Oncology at the Ontario Institute for Cancer Research.
He wrote the script. I downloaded it. I was like, “Man, this is cool. I'm having so much fun playing around with this.” It became my hobby to contribute to that project, and I came at it from the angle of product management.
It emerged from this very hacker-esque command line interface into an application. We had a front-end developer who was like, “Hey, I think I could build an app that plugs into this and we could make it really good.” We had the right people in the room talking to each other and making this happen.
A few months later, we were probably the earliest in the entire industry to innovate by having a canvas where you could actually draw, edit, and manipulate images.
We started interfacing with the big companies in the space. We talked to Stability; they were interested in bringing us on and having us join their team. We were talking to the big chip manufacturers, and everyone was starting to go crazy about AI because ChatGPT had been released.
That was a decision point for me: I had a path to continue growing into CEO roles, but I was spending my free time exploring this space of creativity.
We were building tools that people were telling us were giving them artistic creative license. They felt more in control using what we were building than anything else out there and I was like, “I'm just going to do this.”
So, I effectively jumped and had the fortune of getting investment early from some of the people I'd worked with in my network. We got our seed round in June and have been building an enterprise product, really focused on delivering this type of solution at scale for the problems that businesses are going to have.
What was your relationship with AI before? You've played around with design tools since childhood, but in terms of AI, was it something you were familiar with pre-AI boom?
I’ve been paying attention to the machine learning space for a decade, keeping an eye on what’s going on. My degree is in economics, and that’s very statistics driven. Machine learning is statistics driven. So, it always made sense to me.
What I’ve seen over the years is that very few of the businesses I’ve worked with knew how to apply machine learning well to their problems.
Many of our competitors, and much of the industry at large, are selling a vision of AI that's somewhat – at least right now – BS. Saying things like, “It's going to solve every problem,” or “You're never going to have to lift a finger.”
Since the beginning, we've had the perspective that human creatives will be needed. You need a human creative to drive this.
There are a lot of people who think, “Well, I’ve got taste. I’m going to go type my prompt in and make an image and it's good.” And then you see the crap that fills LinkedIn and you’re like, well, you definitely generated a picture, but it’s pretty low quality and everyone can tell it's AI.
When you put this in an artist's hands and you give them the control that we offer, they can do so many cool things.
You talk to an artist who sketches something and then renders it out in Photoshop. Even with digital tools, it takes about 100 to 150 hours to get a final piece of work done. [With Invoke], one of the artists we worked with can get a piece down to four to eight hours.
We’re not a replacement for Photoshop. We’re a companion for that type of image editing interface.
That’s a perfect segue to my next question. You’ve said there are ~1,000 levers on Invoke to help creators get exactly what they’re looking for. If I’m a designer who’s never used an AI tool, or the ones I’ve used have been like DALL-E, where I’m just putting in a prompt and getting one result back, what does that learning curve look like?
I wish the answer was one size fits all. I think it depends on the individual, but I'll give you a corollary to consider: If you had never seen Photoshop before, how long would it take you to be good at Photoshop?
But I’ll say this: You can get from zero to one pretty quickly. We put a lot of educational materials online. You can get started with some basics pretty easily. It’s one of those things that's easy to start, difficult to master.
The skill ceiling for this is very high because that's what creative tools ought to be.
We take a very risky approach in building these tools because everyone else out there is building the simple text box where you type a prompt in and it figures out what you want.
You maybe have one or two sliders and you get this very neat little range where you are never allowed to make a bad image, right? Invoke has no guardrails.
You can generate absolutely terrible things just as well as you can generate things you would never be able to make in other tools, because you don’t have the control to get there. We try to make it accessible, but we’re also on the harder end of the spectrum because we're building a pro-grade tool.
We’re not trying to build a consumer application. We’re not trying to build something that's going to be the tool you give to your marketing intern and it just works.
You’ve mentioned that using Invoke is an iterative process. How can a business help the tool understand your brand’s visual language?
It’s kind of like training a new employee. They don’t know your brand; they don't know the rules. What do you do? Well, you show them examples of where your brand is following the rules, and help them develop an understanding of that by looking at materials.
That’s what an AI model training process looks like. You’re creating a data set, and you can do it with as few as 15 images to train it on some concept.
Just like with anyone, if it only has 15 examples, it's going to be less good at it than if it had a thousand examples. The process is effectively just writing a caption for each image to say, “This is the concept I’m teaching you, and here are some other things in the image that aren’t related to the concept.”
Let’s use the color burgundy as an example. Your brand has a specific idea of what burgundy is: you have 20 products that all come in burgundy, and it’s a specific shade.
You’re going to show the AI model 20 images. This is a hat in burgundy, this is a sweater in burgundy. It's going to say, “OK, I see how burgundy or this specific shade of burgundy is different in each of these different situations.” You're effectively teaching it the relationship between that concept and all of these other things.
Part of the process is showing examples in different contexts and giving enough variation so it's able to say, “I really get how this thing applies independent of other contexts.”
That's where it becomes useful because now, you can generalize that concept and say, “Well, we’ve never done a shoe before – let’s see what a burgundy shoe would look like.”
It understands that concept independent of the clothing item, so it can now apply it in a context it has never seen.
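Invoke’s own training pipeline isn’t detailed in this interview, but as a minimal sketch of the captioning step Keirsey describes, many open-source fine-tuning tools pair each training image with a sidecar caption file that names the concept alongside its unrelated context. Everything below – the paths, the trigger word, the filenames – is hypothetical:

```python
# Hypothetical sketch of the captioning step described above: pair each
# training image with a caption naming the concept being taught ("burgundy")
# alongside the unrelated context (hat, sweater, mug), so the model learns
# to separate the two. The image + sidecar .txt layout follows a convention
# common to open-source fine-tuning tools; all names here are invented.
from pathlib import Path

DATASET_DIR = Path("training_data/burgundy")  # the 15-20 brand images
TRIGGER = "brand-burgundy"                    # made-up token for the concept

# For each image, note everything in it that ISN'T the concept.
contexts = {
    "hat.png": "a wool baseball hat on a white background",
    "sweater.png": "a knit sweater folded on a table",
    "mug.png": "a ceramic coffee mug in studio lighting",
}

for filename, context in contexts.items():
    image_path = DATASET_DIR / filename
    if not image_path.exists():
        continue  # skip captions for images that aren't present
    # Caption = unrelated context + the concept token the model should learn.
    caption = f"a photo of {context}, in the color {TRIGGER}"
    image_path.with_suffix(".txt").write_text(caption, encoding="utf-8")
```

With enough variation across those contexts, a trainer can associate the concept token with the shade itself rather than with any one product – which is what makes the burgundy shoe possible.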
I want to dig a little bit into copyright and intellectual property. Obviously that’s a big concern for a lot of creatives, and I’m curious: how are you helping brands understand that and protect their own IP?
Part of this is education and helping our customers understand the realities of where the tools are today and where the risks are.
There are two big questions around copyright: What went into these models and what comes out of these models.
In 2023, the U.S. Copyright Office came out with a statement that anything generated by AI through a basic text prompt, regardless of how laborious the prompting, gets no copyright on the output. Their stance is that prompting is insufficient human expression to control the output.
We've done a lot of work to enhance controllability inside of our tool to provide that level of human expression so that you can make the argument that it merits copyright.
We help collect every ounce of that in an embedded metadata standard that we inject into the image. So every image you generate using our tool has every parameter that went into generating it, including things like your control images, your sketches, etc.
We believe that our metadata standard is going to be helpful in supporting and demonstrating a claim of copyright on the outputs.
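The interview doesn’t specify Invoke’s actual metadata schema, but as a rough illustration of the general mechanism – embedding generation parameters in a PNG text chunk with Pillow – here is a sketch in which the field names and values are entirely invented:

```python
# Illustration of embedding generation parameters in an image's metadata.
# This is NOT Invoke's actual metadata standard; the schema below is invented.
# It only shows the general mechanism: a JSON payload in a PNG text chunk.
import json
from PIL import Image
from PIL.PngImagePlugin import PngInfo

params = {
    "prompt": "a burgundy leather shoe, studio lighting",
    "seed": 123456789,
    "steps": 30,
    "control_images": ["sketch_01.png"],  # pointers to the human inputs
}

image = Image.open("generated.png")
info = PngInfo()
info.add_text("generation_parameters", json.dumps(params))
image.save("generated_with_metadata.png", pnginfo=info)

# Anyone can later read the parameters back out of the file:
reopened = Image.open("generated_with_metadata.png")
print(json.loads(reopened.text["generation_parameters"]))
```

The point of carrying the control images and sketches along in the file is that they are exactly the evidence of human expression the copyright argument rests on.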
We also help organizations think through demonstrating what went into their training – provenance and logs, for example – so you can show that this model was trained on this concept, that we shaped this word to mean this thing, and that this is how we’re using it as a tool. End to end, you have that level of visibility and control.
What about the risk of creating copyrighted assets?
You are at a much lesser degree of risk if you have a custom model and you're using human expression.
We provide an extra layer on top of that by giving guidance on how to use these tools in a way that decreases that risk.
From a monitoring perspective, we’re evaluating where organizational desire for additional monitoring is going to be. We’re partnering with an organization called Vera – an early-stage startup whose founder is on the National AI Advisory Committee for the White House – to help us address risks on our side.
One of the things that we do, from a creative perspective, is we don’t block customers’ prompts. We're not monitoring prompts and saying, “You put this word in and we're not going to generate that picture.”
It’s a riskier approach, and the reason for that is we work with game studios and movie studios. We’ve got artists wanting to show a different side of humanity, and they want that to be in their art. And we're not going to censor an artist.
Now looking ahead, OpenAI blew everyone away with its Sora release. Do you plan on expanding to video?
The horizon is continuously shifting; we are moving into the next generation of AI media creation, or AI-assisted media creation, right now.
We’re a key player in that equation and we’re going to figure out how we help artists and creatives get better results, whether that's through partnership or building it ourselves.
When we look at what's coming up for us, 3D is not there yet. Video is not there yet. We know that being the best place to control image generation is going to matter in those worlds either way.
We're really focused on offering the best and most comprehensive suite of tools for that image generation process so that we are well positioned to move into that next generation of tools.
What impact do you envision Invoke having on the creative industry?
I hope that we are able to provide the next generation of creative tooling and make it accessible to people so that they can own their intellectual property in a way that I think we're at risk of losing.
There are many of these image generator platforms that are effectively just training on their customers' data, right? And they're making their AI model better.
You can put your content in and you can train it. But you’re never going to be able to actually take that and use it forever, right? It’s not yours. You're renting access to your own visual language, your own AI model.
There's this entire possible world where people can train on these foundational models and own that for themselves as a creative or as a business. They can have that as an asset that they use to power their generative capabilities going forward.
I am on the side of making AI accessible to people and giving people the ability to benefit from this without having to go to one centralized source that has full control and gives you zero ownership. And that’s what Invoke is trying to build.
Editor’s note: Portions of this interview were edited for brevity and clarity.