Composition refers to the way video and audio are presented to the participants in a group video chat. We think that different views will suit different people depending on the context. In some situations it may be good for all participants to see each other all the time; in others it may be best to see just one other person in full-screen mode. In the performance use case, where the remote audience is more passive, we think that the best representation is the one that creates the best emotional response in the audience. In the socialisation use case, the ideal form of representation is more subjective and is explored through questionnaires. To better understand the options as they relate to the socialisation use case, we conducted a laboratory study in which we compared three different ways of displaying a six-person group call on a computer screen. The three layouts were:

  • Mosaic - participants could see everyone, including themselves, in a mosaic of six equally sized tiles arranged in two rows of three.
  • Hangout - similar to Google Hangouts: based on voice detection, the active speaker was displayed in a main window and the five remaining participants were displayed as a row of five tiles at the bottom of the screen.
  • Full Screen - again based on voice detection, participants saw only the active speaker, as a full-screen image.
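
To make the difference between the three layouts concrete, the following is a minimal sketch of how a client might decide what to show for a six-person call. It is not the implementation used in the study: the function name `compose`, the layout identifiers, and the returned dictionary shape are all hypothetical, and the voice detection step is assumed to happen elsewhere and simply hand over the name of the active speaker. How the layouts behave when nobody is speaking is not described above, so the sketch just raises an error in that case.

```python
from typing import Dict, List, Optional


def compose(layout: str,
            participants: List[str],
            active_speaker: Optional[str] = None) -> Dict[str, object]:
    """Return which participants are shown, and where, for one layout.

    Hypothetical helper: "main" is the participant in the large window
    (or None), and "rows" lists the rows of small tiles, top to bottom.
    """
    if layout == "mosaic":
        # Everyone, including the viewer, in rows of three equal tiles.
        rows = [participants[i:i + 3] for i in range(0, len(participants), 3)]
        return {"main": None, "rows": rows}

    # The remaining layouts depend on the voice detection result.
    if active_speaker not in participants:
        # Assumption: behaviour with no active speaker is unspecified.
        raise ValueError("an active speaker from the call is required")

    if layout == "hangout":
        # Active speaker in the main window, the others in one bottom row.
        others = [p for p in participants if p != active_speaker]
        return {"main": active_speaker, "rows": [others]}

    if layout == "full_screen":
        # Only the active speaker is visible.
        return {"main": active_speaker, "rows": []}

    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    call = ["Ann", "Ben", "Cat", "Dan", "Eve", "Fay"]
    for name in ("mosaic", "hangout", "full_screen"):
        print(name, compose(name, call, active_speaker="Cat"))
```

Only the Mosaic layout is independent of who is speaking; in the other two, every change of active speaker changes what each participant sees.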