the-data-mining-blog

Thursday, December 15, 2005

JSR-000247 Data Mining 2.0 - Early Draft Review

The first release of Java Data Mining (JSR-73) has been available for over a year. It has
seen several commercially available implementations and is being used by companies
deploying data mining functionality internally. We have also seen interest from the academic
realm. For the first release, the expert group was careful not to overextend our
reach in the effort to produce version 1. However, we realized there were still many areas
in data mining that deserved attention.
This early draft review, as specified by the Java Community Process 2.6, gives a
broader audience the opportunity to provide feedback to the expert group on the
evolving JSR-247 supporting JDM 2.0.
The various enhancements to the standard include:
Transformations - a much requested and difficult subject for data mining in general.
JDM 2.0 puts in place a general framework for performing commonly used transformations
as well as open-ended transformations through the use of language-specific expressions.
Time Series - this mining function expands the mining functions supported by JDM and
provides an important capability for supporting forecasting and series data analysis.
Apply for Association - this completes the association mining function, making the prediction
of cross-sell items easier.
Multi-record real-time scoring - enables scoring of multiple records in the record apply
task as a performance optimization for applications.
Multi-target models - enables the specification of multiple targets for supervised models
as a model performance and representation optimization. This also allows common
predictor data to be processed more efficiently.
Multivariate statistics - provides the ability to conveniently compute multivariate statistics
such as the F and T tests, K-S and M-W tests, among others. This provides an extensible
framework for additional statistics. As with univariate statistics, models that produce
multivariate statistics as a by-product can associate these with the model itself.
Text Mining - this is an initial extension for supporting text mining in JDM. It allows vendors
to automate the term extraction process for users wanting to include unstructured text
data in the building of their data mining models.
Task dependencies and scheduling - this extension allows programs to set up multiple
tasks, where one depends on another, for automatic sequential execution without control
from the client application. In addition, tasks can be scheduled to commence at some
future time.
Anomaly detection - this mining function expands the mining functions supported by
JDM and provides an important capability for supporting the detection of unusual events.

# posted by Dr. Martin Menzel @ 6:19 AM 0 comments

Monday, June 27, 2005

Artificial Intelligence [CiteSeer; NEC Research Institute; Steve Lawrence, Kurt Bollacker, Lee Giles]

# posted by Dr. Martin Menzel @ 8:47 AM 0 comments

Sunday, May 15, 2005

[Webmining] GeneticProgrammingReconsidered.pdf (application/pdf-Objekt)

GeneticProgrammingReconsidered.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 4:44 AM 0 comments

[Webmining] XML Similarity - 44.pdf (application/pdf-Objekt)

44.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 2:43 AM 0 comments

[Webmining] IBM Research | Almaden Research Center | Projects - Clever

IBM Research | Almaden Research Center | Projects - Clever
Past Project
The CLEVER search engine incorporates several algorithms that make use of the Web's hyperlink structure for discovering high-quality information. It can be exceedingly difficult to locate resources on the World Wide Web that are both high-quality and relevant to a user's informational needs. Traditional automated search methods for locating information on the Web are easily overwhelmed by low-quality and unrelated content. Second generation search engines have to have effective methods for focusing on the most authoritative documents. The rich structure implicit in hyperlinks among Web documents offers a simple, and effective, means to deal with many of these problems.
Additional Information:
Publications:
Authoritative sources in a hyperlinked environment
[To appear in the Journal of the ACM, 1999. Also appears as IBM Research Report RJ 10076, May 1997]
Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text
[Proceedings of the 7th World-Wide Web conference, 1998. Copyright owned by Elsevier Sciences, Amsterdam]
Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies
[VLDB Journal, 1998 (invited)]
Enhanced hypertext categorization using hyperlinks
[Proceedings of ACM SIGMOD 1998]
Mining the link structure of the World Wide Web
Trawling the Web for emerging cyber-communities
[Eighth World Wide Web conference, Toronto, Canada, May 1999]
The web as a graph: Measurements, models and methods
[Eighth World Wide Web conference, Toronto, Canada, May 1999]
Extracting large scale knowledge bases from the web.
[IEEE International conference on Very Large Databases (VLDB), Edinburgh, Scotland]
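
To make the idea of using hyperlink structure to find authoritative documents more concrete, here is a minimal sketch of hub and authority scoring in the spirit of the "Authoritative sources in a hyperlinked environment" paper listed above. It is an illustration under stated assumptions, not the CLEVER implementation: the function name `hits`, the dictionary-based link graph, and the fixed iteration count are all choices made for this example.

```python
from math import sqrt


def hits(links, iterations=50):
    """Hub/authority scoring sketch.

    links: dict mapping a page name to the list of pages it links to
           (a hypothetical in-memory link graph, assumed for illustration).
    Returns two dicts: hub scores and authority scores, each L2-normalized.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: a page is a good authority if good hubs link to it.
        new_auths = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auths[dst] += hubs[src]
        norm = sqrt(sum(v * v for v in new_auths.values())) or 1.0
        auths = {p: v / norm for p, v in new_auths.items()}

        # Hub update: a page is a good hub if it links to good authorities.
        new_hubs = {p: sum(auths[d] for d in links.get(p, ())) for p in pages}
        norm = sqrt(sum(v * v for v in new_hubs.values())) or 1.0
        hubs = {p: v / norm for p, v in new_hubs.items()}

    return hubs, auths


# Tiny made-up graph: pages "a" and "b" both link to "c".
example_links = {"a": ["c"], "b": ["c"], "c": []}
example_hubs, example_auths = hits(example_links)
```

In this toy graph, "c" ends up as the only page with a non-zero authority score, while "a" and "b" share the hub score equally. A real system would first have to select a topic-focused subgraph and deal with noise such as navigational or spam links, which this sketch leaves out.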

# posted by Dr. Martin Menzel @ 12:14 AM 0 comments

Saturday, May 14, 2005

[Webmining] p289.pdf (application/pdf-Objekt)

p289.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 10:53 AM 0 comments

Sunday, May 08, 2005

[WebMining] SimFusion.pdf (application/pdf-Objekt)

SimFusion.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 10:30 PM 0 comments

[Webmining] SimFusion.pdf (application/pdf-Objekt)

SimFusion.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 1:24 PM 0 comments

[Webmining][Graphtheory] shamir99faster.pdf (application/pdf-Objekt)

shamir99faster.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 1:33 AM 0 comments

[Webmining][Graphtheory] tsui-2002-02.pdf (application/pdf-Objekt)

tsui-2002-02.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 1:32 AM 0 comments

[Webmining][Graphtheory] cpmproctree.pdf (application/pdf-Objekt)

cpmproctree.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 1:24 AM 0 comments

[Webmining][Graphtheory]Teresa Przytycka

Teresa Przytycka

# posted by Dr. Martin Menzel @ 1:13 AM 0 comments

Saturday, April 30, 2005

A Survey of Web Metrics - Web Mining related paper

dhyani02survey.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 11:44 PM 0 comments

Similarity Queries - Web Mining related paper

cohen99recognizing.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 11:40 PM 0 comments

HTML Similarities - Web Mining related paper

jeh02simrank.pdf (application/pdf-Objekt)

# posted by Dr. Martin Menzel @ 11:37 PM 0 comments

Friday, April 29, 2005

Elsevier.com

# posted by Dr. Martin Menzel @ 4:54 AM 0 comments

Elsevier.com

# posted by Dr. Martin Menzel @ 4:54 AM 0 comments

Author Gateway - Getting Published - LaTeX file guidelines

# posted by Dr. Martin Menzel @ 4:54 AM 0 comments

IEEE Intelligent Systems

# posted by Dr. Martin Menzel @ 4:45 AM 0 comments

AI Reference Shelf

# posted by Dr. Martin Menzel @ 4:40 AM 0 comments

Journal Informations in the Reference List ...

lavrac96intelligent.pdf (application/pdf-Objekt)

Journal Informations in the Reference List ...

# posted by Dr. Martin Menzel @ 4:39 AM 0 comments

Elsevier Author Gateway

# posted by Dr. Martin Menzel @ 4:31 AM 0 comments

Elsevier Author Gateway

# posted by Dr. Martin Menzel @ 4:30 AM 0 comments

Elsevier Author Gateway

# posted by Dr. Martin Menzel @ 4:30 AM 0 comments

Elsevier Author Gateway

# posted by Dr. Martin Menzel @ 4:30 AM 0 comments

Machine Learning and Natural Language Processing Lab

Machine Learning and Natural Language Processing Lab: "Link Statistical Methods in Medical Research"

# posted by Dr. Martin Menzel @ 4:28 AM 0 comments

Potential Utility of Data-Mining Algorithms for Early Detection of Potentially Fatal/Disabling Adverse Drug Reactions: A Retrospective Evaluation -- H

Potential Utility of Data-Mining Algorithms for Early Detection of Potentially Fatal/Disabling Adverse Drug Reactions: A Retrospective Evaluation -- Hauben and Reich 45 (4): 378 -- The Journal of Clinical Pharmacology

# posted by Dr. Martin Menzel @ 4:28 AM 0 comments

Elsevier Author Gateway

# posted by Dr. Martin Menzel @ 4:27 AM 0 comments

ScienceDirect - Artificial Intelligence in Medicine - List of Issues

# posted by Dr. Martin Menzel @ 4:25 AM 0 comments
