Spring Batch Learning by Example (Part 1)

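This series works through the official Spring Batch documentation and records each example along the way.

This post covers the first getting-started example from the official guide.

Official documentation

https://spring.io/guides/gs/batch-processing/#scratch

Source code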

https://gitee.com/bseaworkspace/study_java_web/tree/master/springbatchdemobasic

Code structure

[Figure: project code structure]

Business data

Typically, your customer or a business analyst supplies a spreadsheet; in other words, the user hands over tabular data that needs to be processed. For this simple example, you can find some made-up data in src/main/resources/sample-data.csv:


  
    Jill,Doe
    Joe,Doe
    Justin,Doe
    Jane,Doe
    John,Doe

 

 

Each row of this data contains a first name and a last name, separated by a comma.
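To make the row format concrete, here is a minimal sketch in plain Java of splitting one such line into its two fields. The class CsvLineSketch and the method parseLine are hypothetical names used only for illustration; in the actual job this work is done by the FlatFileItemReader configured later, not by hand-written code like this.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spring Batch: splits one line of
// sample-data.csv into a first name and a last name.
class CsvLineSketch {

    // Returns {firstName, lastName} for a line such as "Jill,Doe".
    static String[] parseLine(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException(
                    "Expected 2 fields but got " + fields.length + ": " + line);
        }
        return new String[] { fields[0].trim(), fields[1].trim() };
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Jill,Doe", "Joe,Doe");
        for (String line : lines) {
            String[] name = parseLine(line);
            System.out.println("firstName=" + name[0] + ", lastName=" + name[1]);
        }
    }
}
```

The sketch rejects any line that does not have exactly two fields, which is an assumption about how strict the input should be rather than something the guide specifies.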

Next, prepare an SQL script that creates the table the data will be written to, src/main/resources/schema-all.sql:


  
    DROP TABLE people IF EXISTS;

    CREATE TABLE people (
        person_id BIGINT IDENTITY NOT NULL PRIMARY KEY,
        first_name VARCHAR(20),
        last_name VARCHAR(20)
    );

 

Spring Boot runs schema-@@platform@@.sql automatically during startup. -all is the default for all platforms.

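Spring Boot offers two ways to define the database schema and load data:

  1. Use the tooling provided by Hibernate, which scans for @Entity classes and creates the corresponding tables, then imports test data from import.sql.
  2. Use plain Spring JDBC: define the table structure in schema.sql and load test data from data.sql.

This example takes the Spring JDBC approach, so the required tables are created automatically when the Spring Boot application starts.

Spring Boot can automatically create the schema of a DataSource (DDL scripts) and initialize it (DML scripts). It loads SQL from the standard locations schema.sql and data.sql (in the root of the classpath); the script locations can be changed with spring.datasource.schema and spring.datasource.data. In addition, Spring Boot loads schema-${platform}.sql and data-${platform}.sql if they exist, where platform is the value of spring.datasource.platform. You can set it to the database vendor name (hsqldb, h2, oracle, mysql, postgresql, and so on).

Spring JDBC initialization is fail-fast by default: if a script raises an exception, the application fails to start. The spring.datasource.continue-on-error setting controls whether startup continues anyway. This is useful once an application has matured and been deployed many times, for example when a failed insert only means the data already exists and there is no reason to stop the application.

If you use schema.sql in a JPA application and Hibernate tries to create the same tables, ddl-auto=create-drop causes errors. To avoid them, set ddl-auto to "" or none.

Finally, data initialization can be switched off: spring.datasource.initialize=false in Spring Boot 1.x, or spring.datasource.initialization-mode=never in Spring Boot 2.x releases before 2.5, which includes the 2.4.3 used here.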
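The script naming convention described above can be sketched as a small Java helper. The class SchemaScriptNamesSketch and the method schemaCandidates are hypothetical and only illustrate how the file names are derived from the platform value; this is not Spring Boot's own code, and the order of the returned list is not meant to reflect the order in which Spring Boot actually runs the scripts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the schema script naming convention.
class SchemaScriptNamesSketch {

    // Lists the schema script names that the convention yields for a platform value.
    static List<String> schemaCandidates(String platform) {
        List<String> names = new ArrayList<>();
        names.add("schema.sql");                  // standard location
        names.add("schema-" + platform + ".sql"); // platform-specific variant
        return names;
    }

    public static void main(String[] args) {
        // "all" is the default platform, which is why this project uses schema-all.sql.
        System.out.println(schemaCandidates("all"));
        System.out.println(schemaCandidates("hsqldb"));
    }
}
```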

 


 

 

Implementing the code

Setting up the Maven pom file

 


  
    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <parent>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-parent</artifactId>
            <version>2.4.3</version>
            <relativePath/> <!-- lookup parent from repository -->
        </parent>
        <groupId>com.example</groupId>
        <artifactId>batch-processing</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <name>batch-processing</name>
        <description>Demo project for Spring Boot</description>
        <properties>
            <java.version>1.8</java.version>
        </properties>
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-batch</artifactId>
            </dependency>
            <dependency>
                <groupId>org.hsqldb</groupId>
                <artifactId>hsqldb</artifactId>
                <scope>runtime</scope>
            </dependency>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-test</artifactId>
                <scope>test</scope>
            </dependency>
            <dependency>
                <groupId>org.springframework.batch</groupId>
                <artifactId>spring-batch-test</artifactId>
                <scope>test</scope>
            </dependency>
        </dependencies>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-maven-plugin</artifactId>
                </plugin>
            </plugins>
        </build>
    </project>

Business entity class

Now that you can see the format of data inputs and outputs, you can write code to represent a row of data. Each instance of the entity class corresponds to one row of the table data above, as the following example shows:


  
    package com.xsz.entity;

    // Represents one row of sample-data.csv: a first name and a last name.
    public class Person {

        private String lastName;
        private String firstName;

        // No-argument constructor, needed so the reader can create and populate instances.
        public Person() {
        }

        public Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        public void setFirstName(String firstName) {
            this.firstName = firstName;
        }

        public String getFirstName() {
            return firstName;
        }

        public String getLastName() {
            return lastName;
        }

        public void setLastName(String lastName) {
            this.lastName = lastName;
        }

        @Override
        public String toString() {
            return "firstName: " + firstName + ", lastName: " + lastName;
        }
    }

Create an Intermediate Processor

A common paradigm in batch processing is to ingest data, transform it, and then pipe it out somewhere else. Here, you need to write a simple transformer that converts the names to uppercase: the intermediate processor takes the input object and uppercases its first-name and last-name values. The following listing shows how to do so:

 


  
    package com.xsz.processor;

    import com.xsz.entity.Person;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.batch.item.ItemProcessor;

    public class PersonItemProcessor implements ItemProcessor<Person, Person> {

        private static final Logger log = LoggerFactory.getLogger(PersonItemProcessor.class);

        // Builds a new Person with uppercased names; the input object is left unchanged.
        @Override
        public Person process(final Person person) throws Exception {
            final String firstName = person.getFirstName().toUpperCase();
            final String lastName = person.getLastName().toUpperCase();

            final Person transformedPerson = new Person(firstName, lastName);

            log.info("Converting (" + person + ") into (" + transformedPerson + ")");

            return transformedPerson;
        }
    }
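The transformation itself does not depend on Spring, so it can be tried out in isolation. The following is a minimal sketch in plain Java that mirrors the processor's logic; UppercaseNameSketch and its nested Name class are hypothetical stand-ins for PersonItemProcessor and Person, introduced only so the example compiles without Spring Batch on the classpath. One deliberate difference from the listing above: the sketch passes Locale.ROOT to toUpperCase so the result does not depend on the JVM's default locale.

```java
import java.util.Locale;

// Hypothetical, Spring-free stand-in for the processor's transformation.
class UppercaseNameSketch {

    // Minimal immutable stand-in for the Person entity.
    static final class Name {
        final String firstName;
        final String lastName;

        Name(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        @Override
        public String toString() {
            return "firstName: " + firstName + ", lastName: " + lastName;
        }
    }

    // Returns a new Name with both parts uppercased; the input is not modified.
    static Name process(Name input) {
        return new Name(
                input.firstName.toUpperCase(Locale.ROOT),
                input.lastName.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Name input = new Name("Jill", "Doe");
        Name output = process(input);
        System.out.println("Converting (" + input + ") into (" + output + ")");
    }
}
```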

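Put Together a Batch Job

  • @Configuration: placed on a Java class to mark it as a configuration class that replaces an XML file
       
  • @EnableBatchProcessing
       

@EnableBatchProcessing
This annotation works much like the other @Enable* annotations in the Spring family. As the name suggests, it enables Spring Batch.

When a configuration class carries this annotation, Spring automatically creates a set of beans needed to run Spring Batch and registers them in the Spring container. When you need one of these beans, a single @Autowired is enough to inject it.

The generated beans and their names are: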

JobRepository - bean name "jobRepository"
 
JobLauncher - bean name "jobLauncher"
 
JobRegistry - bean name "jobRegistry"
 
PlatformTransactionManager - bean name "transactionManager"
 
JobBuilderFactory - bean name "jobBuilders"
 
StepBuilderFactory - bean name "stepBuilders"

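Behind @EnableBatchProcessing is the BatchConfigurer interface.

Through it, the objects that @EnableBatchProcessing supplies can be changed.

For example, if you want the returned jobRepository to use a custom data source, or want the metadata tables to use the prefix SPRING_BATCH_ instead of the default BATCH_, you can override the corresponding methods of BatchConfigurer.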
 


 

 

The first chunk of code defines the input, processor, and output; a chunk-oriented step needs all three configured.

  • reader() creates an ItemReader. It looks for a file called sample-data.csv and parses each line item with enough information to turn it into a Person. In other words, reader() defines the input rules: it reads the CSV file, and each line maps to one Person object.

  • processor() creates an instance of the PersonItemProcessor that you defined earlier, meant to convert the data to upper case. It uppercases the property values of the Person objects that the reader has loaded into the JVM.

  • writer(DataSource) creates an ItemWriter. This one is aimed at a JDBC destination and automatically gets a copy of the dataSource created by @EnableBatchProcessing. It includes the SQL statement needed to insert a single Person, driven by Java bean properties.


  
    package com.xsz.config;

    import com.xsz.entity.Person;
    import com.xsz.processor.PersonItemProcessor;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.core.launch.support.RunIdIncrementer;
    import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
    import org.springframework.batch.item.database.JdbcBatchItemWriter;
    import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
    import org.springframework.batch.item.file.FlatFileItemReader;
    import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
    import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.core.io.ClassPathResource;

    import javax.sql.DataSource;

    @Configuration
    @EnableBatchProcessing
    public class BatchConfiguration {

        @Autowired
        public JobBuilderFactory jobBuilderFactory;

        @Autowired
        public StepBuilderFactory stepBuilderFactory;

        // ...

        // Input: reads sample-data.csv and maps each line to a Person.
        @Bean
        public FlatFileItemReader<Person> reader() {
            return new FlatFileItemReaderBuilder<Person>()
                    .name("personItemReader")
                    .resource(new ClassPathResource("sample-data.csv"))
                    .delimited()
                    .names(new String[]{"firstName", "lastName"})
                    .fieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
                        setTargetType(Person.class);
                    }})
                    .build();
        }

        // Processing: uppercases the names.
        @Bean
        public PersonItemProcessor processor() {
            return new PersonItemProcessor();
        }

        // Output: inserts each Person into the people table via JDBC.
        @Bean
        public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
            return new JdbcBatchItemWriterBuilder<Person>()
                    .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
                    .sql("INSERT INTO people (first_name, last_name) VALUES (:firstName, :lastName)")
                    .dataSource(dataSource)
                    .build();
        }

        // JobCompletionNotificationListener is not listed in this article; see the linked source code.
        @Bean
        public Job importUserJob(JobCompletionNotificationListener listener, Step step1) {
            return jobBuilderFactory.get("importUserJob")
                    .incrementer(new RunIdIncrementer())
                    .listener(listener)
                    .flow(step1)
                    .end()
                    .build();
        }

        // A single step that writes items in chunks of up to 10.
        @Bean
        public Step step1(JdbcBatchItemWriter<Person> writer) {
            return stepBuilderFactory.get("step1")
                    .<Person, Person> chunk(10)
                    .reader(reader())
                    .processor(processor())
                    .writer(writer)
                    .build();
        }
    }
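The chunk(10) call in step1 means that items are handed to the writer in groups of up to ten rather than one at a time. The following plain-Java sketch illustrates that read, process, write-per-chunk idea. ChunkLoopSketch and its run method are hypothetical names, and the loop is a deliberate simplification: it leaves out transactions, restartability, and error handling, and it does not reproduce how Spring Batch orders reading and processing internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical, simplified illustration of chunk-oriented processing.
class ChunkLoopSketch {

    // Reads items one by one, processes each, and collects them into chunks of
    // at most chunkSize. Each inner list stands for one call to the writer.
    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<O>> writtenChunks = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writtenChunks.add(chunk); // "write" a full chunk
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            writtenChunks.add(chunk); // "write" the final, possibly smaller chunk
        }
        return writtenChunks;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Jill", "Joe", "Justin", "Jane", "John");
        // A chunk size of 2 is used here only to make the grouping visible with 5 items.
        List<List<String>> chunks = run(names.iterator(), s -> s.toUpperCase(Locale.ROOT), 2);
        System.out.println(chunks);
    }
}
```

With the five rows of sample-data.csv and a chunk size of 10, everything would fit into a single chunk, so the writer would be called once.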

 

Source: blog.csdn.net. Author: bseayin. Copyright belongs to the original author; to republish, please contact the author.

Original link: blog.csdn.net/h356363/article/details/115024776

(End)