
Video is absolutely something that AI can do so I thought I’d test Pika.
It bills itself as video on command. So if you need a clip, the pitch is that Pika can make it for you.
There are two parts I looked at: creating video, and lip synching, which means adding someone else’s voice to existing video.
You can find the site here.
Creating video
Firstly, I uploaded an image of the British seaside from Wikimedia Commons that I’m using under a Creative Commons license. This gave Pika something to work with.
It’s an amusement arcade on a wet day in Burnham-on-Sea in Somerset. What can I say? I’m a child of the 1970s. I saw a lot of these on day trips to Rhyl in North Wales.
Pic credit: Wikimedia Commons / David Martin.
The first prompt was this…
This is a British seaside amusement arcade. We can hear seagulls, rain and the distant bleep of slot machines.
It brought this…
Which was okay, but the seagulls weren’t coming through in the audio. I hadn’t actually noticed the people in the picture, so I was pleasantly surprised to see them identified as humans and walking like humans too.
But I wasn’t happy. So, I tried again. This time I wanted more rainfall and flashing lights in the amusements.
This is a British seaside amusement arcade. We can hear rainfall and the distant bleep of slot machines. There are lights flashing in the amusement arcade. We can also see rainfalling on the damp road and the reflection of approaching cars.
This brought a better result.
And asking it to retry made for better results although the audio is still not there.
All useful. So, I thought I’d take a look at what the results would be if I gave Pika nothing to work with.
Here is the anime version of that same prompt.
This is a British seaside amusement arcade. We can hear rainfall and the distant bleep of slot machines. There are lights flashing in the amusement arcade. We can also see rainfalling on the damp road and the reflection of approaching cars.
No, that’s not what I’m after. But my own fault for choosing a Japanese film style.
And then finally just text.
This is not what I’m after at all. The cars look like they are from the 1950s, the signs look foreign and so does the road surface.
But hey, that’s on me.
Lip synching
One interesting part of Pika is the ability to take video and merge it with imported audio.
I thought I’d give this a test by uploading two random videos and seeing how they knit together.
The first one I found was a couple of seconds of a vox pop in Bristol, with a woman saying ‘another one?’ in an incredulous voice.
Then I added a short clip of Roy Keane being interviewed.
To make this…
Not perfect, but at first glance the audio has been mixed with the video to give the Bristol woman Roy Keane’s voice.
Technical part
For the video, I’m able to adjust the tilt, zoom, pan and rotation, and I’m also able to tweak prompts and retry. The moral of the story here is to keep trying and experimenting.
Conclusion
As something to add life to existing images, this looks fantastic. I’m not convinced I’d use straight text to video. Overall, this could be useful for cutaways and supporting footage. I’m not that impressed with the sounds, but the lip synching is dangerously good. Ethically, I’m not sure why you’d use it.
Text to video
This was not good. Yes, it’s amazing that some words can end up as video, and you’ve got to give them that, but I wouldn’t use this.
SCORE: 1 out of 5.
Text to video with an uploaded image
This was really good and I can see a definite role in animating images for cutaways and better storytelling.
SCORE: 4 out of 5.
Sound
This was bad. Yes, there were slot machines but they were implausible. You’d need an alternative source for your audio.
SCORE: 0 out of 5.
Lip synching
It’s very easy to come up with something that at first glance looks the part. The ethics of it is another thing.
SCORE: 3.5 out of 5.
The site is free but at $8 (£6.27) a month you can take the Pika logo off, extend the video length and frame resolution and buy extra credits. The pro version is $58 (£45.46) a month.
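If you want to sanity-check those numbers, here is a rough Python sketch that works out the exchange rate implied by the prices above and what each plan would cost over a year. The plan labels are my own placeholders, and the yearly figure simply assumes twelve monthly payments with no annual discount, which Pika may or may not offer.

```python
# Monthly prices as quoted in the review: (USD, GBP).
# The plan labels are hypothetical; the review only gives the prices.
PLANS = {
    "paid_tier": (8.00, 6.27),
    "pro": (58.00, 45.46),
}


def implied_rate(usd: float, gbp: float) -> float:
    """Return the GBP-per-USD rate implied by a quoted price pair."""
    return gbp / usd


def yearly_cost(monthly: float) -> float:
    """Twelve monthly payments, assuming no annual discount."""
    return round(monthly * 12, 2)


if __name__ == "__main__":
    for label, (usd, gbp) in PLANS.items():
        rate = implied_rate(usd, gbp)
        print(
            f"{label}: ${usd:.2f} = £{gbp:.2f} a month "
            f"(about {rate:.3f} GBP per USD), £{yearly_cost(gbp):.2f} a year"
        )
```

Both price pairs imply roughly 0.78 pounds to the dollar, so the two conversions are consistent with each other. Exchange rates move, so treat the pound figures as a snapshot.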
Pika can be found here.