What do I need to build a speech recogniser to control images?

Hello everyone~~

I am currently working on a project that uses a speech recognizer to control which images are shown
on the screen, in order to create a sort of translation device via images.
For example, when I say the word “dog”, the system should immediately open the image of a dog that
I have saved on an SD card and matched to the word “dog” in my program.

I’ve already got:

an Arduino UNO
a Voice Recognition V3 module (elechouse/VoiceRecognitionV3 on GitHub, the Arduino library for the elechouse Voice Recognition V3 module)

I am thinking of getting an SD card shield and a small screen.

Please help me see if there is anything else I need to make it work.

Please see the diagram for a better understanding of my project.

Thank you very much

If you mean those TFT shields, it will be too slow. I have written an app that writes raw image data from an SD card to a 3.2" TFT shield, and it is painfully slow: almost a minute to fill the screen. I'm not sure where the bottleneck is, but that is what you will get with the standard libraries.

Something like a Raspberry Pi is probably a better platform for this.
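A rough estimate helps put the minute-long redraw mentioned above into numbers. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: the post does not give the panel resolution or colour depth, so 320x240 at 16 bits per pixel is assumed, and "almost a minute" is rounded to 60 seconds. The 2048-byte figure is the SRAM of the UNO's ATmega328P.

```cpp
#include <cstdint>

// Bytes needed for one uncompressed full-screen frame.
constexpr uint32_t frameBytes(uint32_t width, uint32_t height, uint32_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Effective throughput in bytes per second if one frame takes `seconds` to draw.
constexpr uint32_t bytesPerSecond(uint32_t frame, uint32_t seconds) {
    return frame / seconds;
}

// Assumed panel: 320x240 at 16 bits per pixel (not stated in the thread).
constexpr uint32_t kFrame = frameBytes(320, 240, 2);

// Assumed draw time: "almost a minute" rounded to 60 seconds.
constexpr uint32_t kRate = bytesPerSecond(kFrame, 60);

// SRAM available on an Arduino UNO (ATmega328P).
constexpr uint32_t kUnoSramBytes = 2048;

// How many times larger one frame is than the UNO's entire SRAM.
constexpr uint32_t kFrameToSramRatio = kFrame / kUnoSramBytes;
```

Under these assumptions a frame is 153,600 bytes, about 75 times the UNO's SRAM, so the image has to be streamed from the card in small chunks, and the observed speed works out to only a few kilobytes per second.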

The display linked below lets you preload several images into its on-board SRAM before displaying them, which would give you time to upload a new image while another is being displayed. Also, if you don't absolutely have to load the image from an SD card, you can upload around 23-35 full-size images directly to the display's flash and have them fully load in under half a second.

http://www.ebay.com/itm/4-3-inch-TFT-LCD-module-480x272-CPLD-SDRAM-arduino-DUE-MEGA-MD043SD-SSD1963-/111627018023?hash=item19fd7d1f27

You will need a Mega or Due, however.