My first thought is that you may not have a large enough sample size to get meaningful results. I would do at least 500 on both sides, which is still small. I could be wrong; maybe you are testing a larger sample.
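To put a rough number on "enough," here's a quick sketch of the standard two-proportion sample-size formula (normal approximation, 95% confidence, 80% power). The example rates are made up; plug in your own baseline and the lift you care about detecting:

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Approximate recipients needed PER VARIANT to detect a change
    from rate p1 to rate p2 (pooled two-proportion z-test)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = NormalDist().inv_cdf(power)          # power requirement
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

# Detecting a lift from a 20% to a 25% open rate:
print(sample_size_per_variant(0.20, 0.25))  # → 1094 per variant
```

So even a fairly big lift (20% → 25% opens) needs about 1,100 recipients per side; 500 per side will only reliably catch much larger differences.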
Another thing is that you should be measuring click-to-open rate as your winning criterion if you are doing whole-email tests.
This is a great test, by the way. I ran the same one, and text emails performed better every time.
I always declare an A/B test winner manually. The reason is that there are a lot of false positives when it comes to metrics like opens and clicks. Additionally, each email can have its own success metrics that are not necessarily captured by opens and clicks. For example, regardless of opens and overall clicks, if an email had a better click-through rate on a specific link, that might be a better success metric than overall clicks.
For these reasons, I let the test ride and manually choose a winner when I feel one email achieved its goal better than the other variant.
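On the false-positive point: before declaring a winner by eye, it can help to run a quick significance check on the raw counts. A minimal sketch (pooled two-proportion z-test, made-up numbers):

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for a difference in click (or open) rates,
    using the pooled two-proportion z-test."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    p_pool = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# 30 vs 20 clicks out of 250 each LOOKS like a 50% lift, but:
p = two_proportion_p_value(30, 250, 20, 250)
print(p)  # well above 0.05, so easily explained by chance
```

With samples this small, an apparent 50% lift in clicks is still statistically indistinguishable from noise, which is exactly why eyeballing opens and clicks produces so many false winners.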
Interesting responses. Thanks!
It was a rather small sample size, around 500 total.
I've got some set up for next week and will manually declare the winner.