Assessing subgroup effects with binary data: can the use of different effect measures lead to different conclusions?