Generate Vietnamese speech from text
Generate vivid images from text prompts
OmniParser, turn your LLM into a GUI agent