Stellar

Intro

With the Stellar benchmarks, we aim to track and report progress in the emerging field of personalized human-centric generation. To this end, we measure how well a personalization method performs on our proposed metrics. A personalization method takes as input an image of a human subject ( S* ) and a text description (a prompt), and places the subject in the imagined context that the prompt describes.

The evaluation happens on three dimensions:

Personalization: We evaluate how accurately a personalization method portrays the human subject ( S* ) in the requested prompt. For example, does the human subject in the generated image resemble the subject in the input image?
Objects: We evaluate how well a personalization method portrays the objects requested in a scene. For example, when a personalization method is given the prompt

S* watering a palm tree at a snowy day

is the expected "object" (snow) present in the generated output, or is it dropped in order to satisfy the personalization constraint?
Object Relations: We evaluate how well the relationship between objects and the human subject is portrayed. For example, a prompt may require the human subject to interact with an object in the scene, such as

S* playing basketball

In such cases, we evaluate whether S* is interacting with the object, "basketball".

In summary, we evaluate how well a personalization method performs across these three dimensions. Personalization is evaluated by the Identity Preservation Score (IPS), the Attribute Preservation Score (APS), and the Stability Identity Score (SIS). Objects is evaluated by Grounding Objects Accuracy (GOA). Lastly, Object Relations is evaluated by the Relation Fidelity Score (RFS).
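As a minimal sketch of how the three dimensions map to the five metrics, the snippet below groups them in a dictionary and lists which metrics a results record is missing. The names `METRICS_BY_DIMENSION` and `missing_metrics` are hypothetical and not part of any Stellar tooling; the example scores are the StellarNet values from the leaderboard tables on this page.

```python
# Hypothetical helper, not part of the official Stellar tooling.
METRICS_BY_DIMENSION = {
    "Personalization": ["IPS", "APS", "SIS"],
    "Objects": ["GOA"],
    "Object Relations": ["RFS"],
}


def missing_metrics(results, dimensions=None):
    """Return the metric names absent from `results` for the given dimensions."""
    if dimensions is None:
        dimensions = list(METRICS_BY_DIMENSION)
    required = [m for d in dimensions for m in METRICS_BY_DIMENSION[d]]
    return [m for m in required if m not in results]


# StellarNet scores as listed in the leaderboard tables.
stellarnet_t = {"IPS": 0.637, "APS": 0.693, "SIS": 0.577, "GOA": 0.305, "RFS": 0.134}
stellarnet_h = {"IPS": 0.622, "APS": 0.685, "SIS": 0.564}

print(missing_metrics(stellarnet_t))                       # prints []
print(missing_metrics(stellarnet_h))                       # prints ['GOA', 'RFS']
print(missing_metrics(stellarnet_h, ["Personalization"]))  # prints []
```

The last call reflects that the Stellar- $H$ table reports only the three Personalization metrics.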

Rules

To evaluate your method on our benchmark, first download and process Stellar- $H$ and Stellar- $T$ following the instructions in our Official Repository. Steps to evaluate your method:

1. Download and process Stellar- $H$ and Stellar- $T$ .
2. Use Stellar- $H$ and Stellar- $T$ to generate images. Ref. example.
3. Use stellar-metrics to compute the statistics on the generated images. Ref. example.
4. Report the results.
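The image generation step above can be sketched as a loop over (subject image, prompt) pairs. This is a minimal sketch under stated assumptions: `generate_all`, the `generate_fn` callback, and the output file naming are all hypothetical. The actual dataset layout and the stellar-metrics interface are defined by the Official Repository, not by this snippet.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def generate_all(
    pairs: Iterable[Tuple[Path, str]],
    generate_fn: Callable[[Path, str], bytes],
    out_dir: Path,
) -> List[Path]:
    """Run a personalization method on every (subject image, prompt) pair.

    `generate_fn` stands in for your method: it receives the subject image
    path and the prompt, and returns the encoded output image as bytes.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, (subject_image, prompt) in enumerate(pairs):
        image_bytes = generate_fn(subject_image, prompt)
        # Hypothetical naming scheme: running index plus the subject file stem.
        out_path = out_dir / f"{index:05d}_{subject_image.stem}.png"
        out_path.write_bytes(image_bytes)
        written.append(out_path)
    return written
```

A dummy `generate_fn` that returns placeholder bytes is enough to dry-run the loop before plugging in a real model.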

We also fine-tune ELITE and StellarNet on CelebAMask-HQ, excluding from the training set the images that appear in Stellar- $H$ and Stellar- $T$ . In the tables below, ✅ marks any method that satisfies this Cross-Val Training constraint. Stellar- $H$ and Stellar- $T$ prompts must be excluded from the pretraining of any method. Any method whose training stage includes photos that also appear in the evaluation set of Stellar- $H$ or Stellar- $T$ is marked with ❌. Textual Inversion and Dreambooth must be fine-tuned per subject identity, so they can only be evaluated under that condition.

If you exclude the images from Stellar- $H$ and Stellar- $T$ when training your method, please indicate this when making a submission.
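One way to check the Cross-Val Training constraint before submitting is to compare identifiers of your training images against those of the evaluation images. The sketch below assumes each image can be reduced to a comparable identifier such as a file name. The function name `cross_val_status` and the example identifiers are hypothetical.

```python
from typing import Iterable, Set, Tuple


def cross_val_status(
    train_ids: Iterable[str], eval_ids: Iterable[str]
) -> Tuple[bool, Set[str]]:
    """Return (constraint_satisfied, overlapping_ids).

    The constraint is satisfied only when no evaluation image identifier
    also appears among the training image identifiers.
    """
    overlap = set(train_ids) & set(eval_ids)
    return (not overlap, overlap)


# Made-up identifiers for illustration only.
satisfied, leaked = cross_val_status(
    ["img_001.jpg", "img_002.jpg"], ["img_002.jpg", "img_900.jpg"]
)
print(satisfied, sorted(leaked))  # prints: False ['img_002.jpg']
```

Matching on file names only catches exact reuse. Detecting re-encoded or cropped copies of the same photo would need a content-based comparison, which this sketch does not attempt.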

Leaderboard

Stellar- $T$ Challenge

Paper	IPS	APS	SIS	GOA	RFS	Cross-Val Training
StellarNet	0.637	0.693	0.577	0.305	0.134	✅
ELITE	0.383	0.490	0.355	0.260	0.106	✅
Dreambooth	0.252	0.317	0.232	0.302	0.103	❌
Textual Inversion	0.287	0.510	0.262	0.229	0.082	❌

Stellar- $H$ Challenge

Paper	IPS	APS	SIS	Cross-Val Training
StellarNet	0.622	0.685	0.564	✅
ELITE	0.368	0.449	0.342	✅
Dreambooth	0.246	0.299	0.228	❌
Textual Inversion	0.299	0.419	0.273	❌

Notes

ELITE* is trained on the training set of the CelebAMask-HQ dataset using the code in the original repo.

Reporting New Results

To report new results on Stellar- $T$ or Stellar- $H$ , please send the performance numbers and a link to the accompanying paper to Alexandros Benetatos.

BibTeX

If you find our work useful in your research, please consider citing:

@article{stellar2023,
  author    = {Achlioptas, Panos and Benetatos, Alexandros and Fostiropoulos, Iordanis and Skourtis, Dimitris},
  title     = {Stellar: Systematic Evaluation of Human-Centric Personalized Text-to-Image Methods},
  volume    = {abs/2312.06116},
  journal   = {Computing Research Repository (CoRR)},
  year      = {2023},
}
