Researchers find widespread weaknesses in AI safety, performance tests: Report