A reference image that provides assets to the generated video, such as a scene, an object, or a character.
A reference image that provides aesthetics, such as colors, lighting, and texture, to be used as the style of the generated video, for example 'anime', 'photography', or 'origami'.
Enum for the reference type of a video generation reference image.
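
As an illustration only, the two reference types described here could be modeled as a string-valued enum. This is a minimal sketch, not the library's actual definition: the class name `ReferenceType`, the member names `ASSET` and `STYLE`, their string values, and the helper `parse_reference_type` are all assumptions introduced for this example.

```python
from enum import Enum


class ReferenceType(str, Enum):
    """Hypothetical enum for the role a reference image plays in video generation."""

    # Assumed member: the image supplies assets (a scene, an object, a character).
    ASSET = "ASSET"
    # Assumed member: the image supplies aesthetics (colors, lighting, texture).
    STYLE = "STYLE"


def parse_reference_type(value: str) -> ReferenceType:
    """Convert a user-supplied string to a ReferenceType, ignoring case and padding."""
    try:
        return ReferenceType(value.strip().upper())
    except ValueError:
        allowed = ", ".join(member.value for member in ReferenceType)
        raise ValueError(
            f"unknown reference type {value!r}; expected one of: {allowed}"
        ) from None


if __name__ == "__main__":
    # A style reference image would be tagged with the STYLE member.
    print(parse_reference_type("style").value)
```

Deriving from `str` keeps the members directly serializable as plain strings, which is convenient if the value ends up in a JSON request body. Whether the real enum does this is not stated in the source.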