Kohei Nozaki's blog

Partial data dump/restore

Posted on Tuesday Mar 11, 2014 at 03:45PM in Technology

dump data

hogedb=# copy (select * from hogetable where basedate between '20070101' and '20070131') TO '/tmp/1month.sql';
COPY 561368
hogedb=#

dump schema

pg_dump --schema-only -t hogetable hogedb > /tmp/1month.ddl

restore schema

hogedb=# \i /tmp/1month.ddl 
SET
SET
SET
SET
SET
SET
SET
SET
CREATE TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
CREATE INDEX
CREATE INDEX
hogedb=#

restore data

hogedb=# copy hogetable from '/tmp/1month.sql';
COPY 561368
hogedb=#

References

COPY

Comments [0]

Tags: postgres

Chunk-oriented Processing with Hibernate's StatelessSession

Posted on Tuesday Mar 11, 2014 at 08:24AM in Technology

Bulk fetching is difficult with standard JPA API, but provider-specific functions make it easy.
This is a example of such provider-specific bulk fetching function of Hibernate with JSR352.

According to JSR 352 Implementation - Connection close probl… | Community, this example seems to be wrong. every cursors need to be opened and closed in a transaction. so another approach is required, e.g. fetch primary keys in open(), then fetch each rows in readItem().

Environment

WildFly8.0.0.Final
Hibernate 4.3.1.Final
Oracle JDK7u51
PostgreSQL 9.2.4

Example

ItemReader

package org.nailedtothex.jbatch.example.chunk;

@Named
public class ChunkItemReader extends AbstractItemReader {
    @PersistenceContext
    EntityManager em;
    ScrollableResults scroll;
    StatelessSession ss;
    Session session;

    @Override
    public void open(Serializable checkpoint) throws Exception {
        session = em.unwrap(Session.class);
        ss = session.getSessionFactory().openStatelessSession();
        scroll = ss.createQuery("SELECT c FROM ChunkInputItem c ORDER BY c.id").scroll(ScrollMode.FORWARD_ONLY);
    }

    @Override
    public Object readItem() throws Exception {
        if (!scroll.next()) {
            return null;
        }

        return scroll.get(0);
    }

    @Override
    public void close() throws Exception {
        try {
            scroll.close();
        } catch (Exception e) {
        }
        try {
            ss.close();
        } catch (Exception e) {
        }
        try {
            session.close();
        } catch (Exception e) {
        }
    }
}

I'm not sure that close() procedures are mandatory or not.

ItemWriter

package org.nailedtothex.jbatch.example.chunk;

@Named
public class ChunkItemWriter extends AbstractItemWriter {
    private static final Logger log = Logger.getLogger(ChunkItemWriter.class.getName());

    @Override
    public void writeItems(List<Object> items) throws Exception {
        log.log(Level.FINE, "chunkItemWriter#writeItems(): {0}", new Object[] { items });
    }
}

persistence.xml

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1"
    xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
    <persistence-unit name="jbatchsls">
        <provider>org.hibernate.ejb.HibernatePersistence</provider>
        <jta-data-source>java:jboss/jdbc/JBatchTestDS</jta-data-source>
        <properties>
            <property name="javax.persistence.schema-generation.database.action" value="none" />
            <property name="hibernate.connection.release_mode" value="on_close"/>  
        </properties>
    </persistence-unit>
</persistence>

“hibernate.connection.release_mode” property is necessary.
- NOTE: use of this declaration is highly discouraged. see the additional note in the bottom of this article which explains the another way to realize it.
If this is not set, then you will get:

15:24:59,047 INFO  [org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl] (batch-batch - 19) HHH000106: Forcing container resource cleanup on transaction completion
15:24:59,048 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (batch-batch - 19) SQL Error: 0, SQLState: null
15:24:59,048 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (batch-batch - 19) The result set is closed.
15:24:59,048 ERROR [org.jberet] (batch-batch - 19) JBERET000007: Failed to run job chunk, doChunk, org.jberet.job.model.Chunk@16d1364: org.hibernate.exception.GenericJDBCException: could not advance using next()
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:54) [hibernate-core-4.3.1.Final.jar:4.3.1.Final]
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126) [hibernate-core-4.3.1.Final.jar:4.3.1.Final]
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112) [hibernate-core-4.3.1.Final.jar:4.3.1.Final]
    at org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:125) [hibernate-core-4.3.1.Final.jar:4.3.1.Final]
    at org.nailedtothex.jbatch.example.chunk.ChunkItemReader.readItem(ChunkItemReader.java:31) [classes:]
    at org.jberet.runtime.runner.ChunkRunner.readItem(ChunkRunner.java:343) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.jberet.runtime.runner.ChunkRunner.readProcessWriteItems(ChunkRunner.java:288) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.jberet.runtime.runner.ChunkRunner.run(ChunkRunner.java:190) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.jberet.runtime.runner.StepExecutionRunner.runBatchletOrChunk(StepExecutionRunner.java:204) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:131) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:162) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runFromHeadOrRestartPoint(CompositeExecutionRunner.java:88) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.jberet.runtime.runner.JobExecutionRunner.run(JobExecutionRunner.java:58) [jberet-core-1.0.1.Beta-SNAPSHOT.jar:1.0.1.Beta-SNAPSHOT]
    at org.wildfly.jberet.services.BatchEnvironmentService$WildFlyBatchEnvironment$1.run(BatchEnvironmentService.java:149) [wildfly-jberet-8.0.0.Final.jar:8.0.0.Final]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_51]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_51]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51]
    at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51]
    at org.jboss.threads.JBossThread.run(JBossThread.java:122)
Caused by: java.sql.SQLException: The result set is closed.
    at org.jboss.jca.adapters.jdbc.WrappedResultSet.checkState(WrappedResultSet.java:4081)
    at org.jboss.jca.adapters.jdbc.WrappedResultSet.next(WrappedResultSet.java:1855)
    at org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:120) [hibernate-core-4.3.1.Final.jar:4.3.1.Final]
    ... 16 more

pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>jbatchsls</groupId>
    <artifactId>jbatchsls</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>war</packaging>
    <properties>
        <maven.compiler.source>1.7</maven.compiler.source>
        <maven.compiler.target>1.7</maven.compiler.target>
        <failOnMissingWebXml>false</failOnMissingWebXml>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.nailedtothex</groupId>
            <artifactId>jbatchif</artifactId>
            <version>0.0.1-SNAPSHOT</version>
        </dependency>
        <dependency>
            <groupId>javax</groupId>
            <artifactId>javaee-api</artifactId>
            <version>7.0</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.hibernate</groupId>
            <artifactId>hibernate-core</artifactId>
            <version>4.3.1.Final</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.hamcrest</groupId>
            <artifactId>hamcrest-core</artifactId>
            <version>1.3</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.jboss</groupId>
            <artifactId>jboss-remote-naming</artifactId>
            <version>2.0.0.Final</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.jboss.xnio</groupId>
            <artifactId>xnio-nio</artifactId>
            <version>3.2.0.Final</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

ChunkInputItem

package org.nailedtothex.jbatch.example.chunk;

import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class ChunkInputItem implements Serializable {
    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(nullable = false)
    private Long id;

    @Column
    private Integer input;

    @Column
    private Boolean processed;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public Integer getInput() {
        return input;
    }

    public void setInput(Integer input) {
        this.input = input;
    }

    public Boolean getProcessed() {
        return processed;
    }

    public void setProcessed(Boolean processed) {
        this.processed = processed;
    }

    @Override
    public String toString() {
        return "ChunkInputItem [id=" + id + ", input=" + input + ", processed=" + processed + "]";
    }
}

Test data

jbatcharts=# select * from chunkinputitem ;
 id | input | processed 
----+-------+-----------
  0 |     0 | f
  1 |    10 | f
  2 |    20 | f
  3 |    30 | f
  4 |    40 | f
  5 |    50 | f
  6 |    60 | f
  7 |    70 | f
  8 |    80 | f
  9 |    90 | f
(10 rows)

jbatcharts=#

Log

15:29:03,916 DEBUG [org.hibernate.SQL] (batch-batch - 20) select chunkinput0_.id as id1_0_, chunkinput0_.input as input2_0_, chunkinput0_.processed as processe3_0_ from ChunkInputItem chunkinput0_ order by chunkinput0_.id
15:29:03,918 FINE  [org.nailedtothex.jbatch.example.chunk.ChunkItemWriter] (batch-batch - 20) chunkItemWriter#writeItems(): [ChunkInputItem [id=0, input=0, processed=false], ChunkInputItem [id=1, input=10, processed=false], ChunkInputItem [id=2, input=20, processed=false]]
15:29:03,920 FINE  [org.nailedtothex.jbatch.example.chunk.ChunkItemWriter] (batch-batch - 20) chunkItemWriter#writeItems(): [ChunkInputItem [id=3, input=30, processed=false], ChunkInputItem [id=4, input=40, processed=false], ChunkInputItem [id=5, input=50, processed=false]]
15:29:03,922 FINE  [org.nailedtothex.jbatch.example.chunk.ChunkItemWriter] (batch-batch - 20) chunkItemWriter#writeItems(): [ChunkInputItem [id=6, input=60, processed=false], ChunkInputItem [id=7, input=70, processed=false], ChunkInputItem [id=8, input=80, processed=false]]
15:29:03,925 FINE  [org.nailedtothex.jbatch.example.chunk.ChunkItemWriter] (batch-batch - 20) chunkItemWriter#writeItems(): [ChunkInputItem [id=9, input=90, processed=false]]

A way to avoid “hibernate.connection.release_mode=on_close”

ItemReader like this works with no declaration of “hibernate.connection.release_mode=on_close” which highly discouraged[2].

public abstract class AbstractHibernateItemReader extends AbstractItemReader {
    @PersistenceContext
    EntityManager em;
    @Resource
    DataSource ds;
    Connection cn;
    ScrollableResults scroll;
    StatelessSession ss;
    Session session;

    @Override
    public void open(Serializable checkpoint) throws Exception {
        cn = ds.getConnection();
        cn.setHoldability(ResultSet.HOLD_CURSORS_OVER_COMMIT);
        session = em.unwrap(Session.class);
        ss = session.getSessionFactory().openStatelessSession(cn);
        scroll = ss.createQuery("SELECT c FROM ChunkInputItem c ORDER BY c.id").scroll(ScrollMode.FORWARD_ONLY);
    }
...
    @Override
    public void close() throws Exception {
        try {
            scroll.close();
        } catch (Exception e) {
        }
        try {
            ss.close();
        } catch (Exception e) {
        }
        try {
            session.close();
        } catch (Exception e) {
        }
        try {
            cn.close();
        } catch (Exception e) {
        }
    }
...

Ugly but I guess that it would be better for some occasions.
There's no proper way to set holdability of resultset with StatelessSession[3].
Configure through Session#doWork() and SessionImpl#connection() didn't worked as expectedly.
- Both of them brings “The result set is closed”.

References

Comments [0]

Tags: jbatch

About Me

Java backend engineer obsessed with test driven development. Apache Roller committer. kohei at apache dot org

« March 2014 »
Sun	Mon	Tue	Wed	Thu	Fri	Sat
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31
Today

Hot tags

Here are the most frequently used tags in the blog. Hover over a tag to see the number of posts, click a tag to see the most recent posts with the tag.

Twitter

Tweets by nu_zayats

Kohei Nozaki's blog

Partial data dump/restore

dump data

dump schema

restore schema

restore data

References

Chunk-oriented Processing with Hibernate's StatelessSession

Environment

Example

ItemReader

ItemWriter

persistence.xml

pom.xml

ChunkInputItem

Test data

Log

A way to avoid “hibernate.connection.release_mode=on_close”

References

About Me

Archives

Hot tags

Recent entries

Recent comments

Twitter