Reading and Writing CSV Files in Java

I have long wanted to sort out a "best" practice for reading and writing CSV in Java, much as I now lean toward easyexcel for handling Excel.

Old version: 2.3

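I first came across the opencsv package when I needed to load a CSV dictionary inside a Hive UDF. It turned out that opencsv had already been pulled in as a transitive dependency, so I simply used it.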

<dependency>
  <groupId>net.sf.opencsv</groupId>
  <artifactId>opencsv</artifactId>
  <version>2.3</version>
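</dependency>

// In opencsv 2.3 the classes live under the au.com.bytecode.opencsv package
import au.com.bytecode.opencsv.CSVReader;
import au.com.bytecode.opencsv.CSVWriter;
import au.com.bytecode.opencsv.bean.CsvToBean;
import au.com.bytecode.opencsv.bean.HeaderColumnNameMappingStrategy;

@Resource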
private SqlSessionFactory sqlSessionFactory;

@Test
public void testWriteToCSV() {
    String sql = "select user_id,nick_name from ut_user_profile";
    try (
            SqlSession sqlSession = sqlSessionFactory.openSession();
            Connection conn = sqlSession.getConnection();
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery(sql);
            PrintWriter writer = new PrintWriter(new File("/tmp/a.csv"));
            CSVWriter csvWriter = new CSVWriter(writer)
    ) {
        // true = write the result set's column names as a header row first
        csvWriter.writeAll(rs, true);
    } catch (SQLException | IOException e) {
        e.printStackTrace();
    }
}

@Test
public void testReadFromCSV() {
    CsvToBean<UserProfile> csvToBean = new CsvToBean<>();
    // Match each CSV column to a bean property by its header name
    HeaderColumnNameMappingStrategy<UserProfile> strategy = new HeaderColumnNameMappingStrategy<>();
    strategy.setType(UserProfile.class);
    try (
            BufferedReader reader = new BufferedReader(new FileReader(new File("/tmp/a.csv")));
            CSVReader csvReader = new CSVReader(reader)
    ) {
        System.out.println(csvToBean.parse(strategy, csvReader));
    } catch (IOException e) {
        e.printStackTrace();
    }
}

@Data
public static class UserProfile {
    private Integer user_id;
    private String nick_name;
}

The generated CSV file looks like this

"user_id","nick_name"
"1","无'忌"
"2","mu""c""he,n"

The parsed result is as follows

UserProfile(user_id=1, nick_name=无'忌)
UserProfile(user_id=2, nick_name=mu"c"he,n)
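
The file and the parsed result above show the quoting convention at work: every field is wrapped in double quotes, an embedded double quote is written twice, and a comma inside quotes is treated as plain text. Below is a minimal sketch of that round trip using only the JDK. The class `CsvQuotingSketch` and its methods are hypothetical names made up for illustration, not opencsv's implementation, and the parser assumes each record fits on a single line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; this is not opencsv code.
class CsvQuotingSketch {

    // Wrap every field in double quotes, doubling embedded quotes,
    // then join the fields with commas.
    static String toCsvLine(String... fields) {
        return Arrays.stream(fields)
                .map(f -> "\"" + f.replace("\"", "\"\"") + "\"")
                .collect(Collectors.joining(","));
    }

    // Split one CSV line back into fields, honoring quotes.
    // Assumption: a record never spans multiple lines.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote -> one literal quote
                    i++;
                } else if (c == '"') {
                    inQuotes = false;    // closing quote
                } else {
                    current.append(c);   // commas are plain text inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;         // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // field separator
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());  // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = toCsvLine("2", "mu\"c\"he,n");
        System.out.println(line);            // "2","mu""c""he,n"
        System.out.println(parseLine(line)); // [2, mu"c"he,n]
    }
}
```

In real code it is better to leave this to the library, which also deals with cases such as line breaks inside fields that this sketch ignores.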

New version: 5.4

The new version offers much better support for writing beans to CSV and reading CSV into beans (BeanToCsv, CsvToBean)!

<dependency>
  <groupId>com.opencsv</groupId>
  <artifactId>opencsv</artifactId>
  <version>5.4</version>
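</dependency>

// From 3.x onward the classes live under the com.opencsv package
import com.opencsv.bean.CsvToBeanBuilder;
import com.opencsv.bean.StatefulBeanToCsv;
import com.opencsv.bean.StatefulBeanToCsvBuilder;
import com.opencsv.exceptions.CsvDataTypeMismatchException;
import com.opencsv.exceptions.CsvRequiredFieldEmptyException;

@Resource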
private UtUserProfileMapper utUserProfileMapper;

@Test
public void testWriteToCSV() {
    List<UserProfile> profileList = utUserProfileMapper.selectAll().stream()
            .map(UserProfile::convert).collect(Collectors.toList());
    try (Writer writer = new FileWriter("/tmp/a.csv")) {
        // No annotations or explicit strategy: the header row is derived from the bean's field names
        StatefulBeanToCsv<UserProfile> beanToCsv = new StatefulBeanToCsvBuilder<UserProfile>(writer).build();
        beanToCsv.write(profileList);
    } catch (IOException | CsvDataTypeMismatchException | CsvRequiredFieldEmptyException e) {
        e.printStackTrace();
    }
}

@Test
public void testReadFromCSV() {
    try (Reader reader = new FileReader("/tmp/a.csv")) {
        List<UserProfile> profileList = new CsvToBeanBuilder<UserProfile>(reader)
                .withType(UserProfile.class).build().parse();
        System.out.println(profileList);
    } catch (IOException e) {
        e.printStackTrace();
    }
}

@Data
public static class UserProfile {
    private Integer userId;
    private String nickName;

    public static UserProfile convert(UtUserProfile profile) {
        UserProfile userProfile = new UserProfile();
        BeanUtils.copyProperties(profile, userProfile);
        return userProfile;
    }
}

The generated CSV file looks like this

"NICKNAME","USERID"
"无'忌","1"
"mu""c""he,n","2"

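I found that manually swapping the two columns does not affect parsing the file back into beans, which is consistent with columns being matched by header name rather than by position.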
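
To make that observation concrete, here is a minimal sketch of looking up values by header name instead of by position, using only the JDK. The class `HeaderMappingSketch` and its methods are hypothetical and written for illustration. The case-insensitive matching is an assumption suggested by the upper-cased header in the file, and opencsv's real mapping strategy does considerably more (type conversion, annotations, validation).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of header-based column lookup; not opencsv code.
class HeaderMappingSketch {

    // Build a map from upper-cased header name to column index.
    static Map<String, Integer> indexByHeader(String[] header) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < header.length; i++) {
            index.put(header[i].trim().toUpperCase(Locale.ROOT), i);
        }
        return index;
    }

    // Fetch a value by column name, wherever that column sits in the row.
    // Returns null when the header has no such column.
    static String valueOf(String[] header, String[] row, String column) {
        Integer i = indexByHeader(header).get(column.toUpperCase(Locale.ROOT));
        return i == null ? null : row[i];
    }

    public static void main(String[] args) {
        // Same record, with the two columns in either order
        String[] header1 = {"NICKNAME", "USERID"};
        String[] row1 = {"无'忌", "1"};
        String[] header2 = {"USERID", "NICKNAME"};
        String[] row2 = {"1", "无'忌"};

        System.out.println(valueOf(header1, row1, "userId"));
        System.out.println(valueOf(header2, row2, "userId"));
    }
}
```

Both lookups return the same value because the index is rebuilt from whatever header row the file carries, so the column order is free to change as long as the header moves with the data.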