Commit 6b6ccec

Initial commit
0 parents  commit 6b6ccec

33 files changed, +2566 -0 lines changed

.gitignore

+2
@@ -0,0 +1,2 @@
+# Project exclude paths
+/target/

.idea/$PRODUCT_WORKSPACE_FILE$

+19
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

.idea/.gitignore

+2
Generated file, not rendered by default.

.idea/compiler.xml

+13
Generated file, not rendered by default.

.idea/misc.xml

+14
Generated file, not rendered by default.

LICENSE

+21
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2019 Stefan Welcker
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.md

+95
@@ -0,0 +1,95 @@
+![csplogo](https://user-images.githubusercontent.com/12301571/67168219-4d618900-f3a2-11e9-9460-b79eff997c35.PNG)
+
+# cmd.csp.similarity
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/swelcker/cmd.csp.similarity/graphs/commit-activity)
+[![GitHub release](https://img.shields.io/github/release/swelcker/cmd.csp.similarity.svg)](https://GitHub.com/swelcker/cmd.csp.similarity/releases/)
+[![GitHub tag](https://img.shields.io/github/tag/swelcker/cmd.csp.similarity.svg)](https://GitHub.com/swelcker/cmd.csp.similarity/tags/)
+[![GitHub commits](https://img.shields.io/github/commits-since/swelcker/cmd.csp.similarity/master.svg)](https://GitHub.com/swelcker/cmd.csp.similarity/commit/)
+[![GitHub contributors](https://img.shields.io/github/contributors/swelcker/cmd.csp.similarity.svg)](https://GitHub.com/swelcker/cmd.csp.similarity/graphs/contributors/)
+
+A library implementing a range of string similarity and distance measures. About a dozen algorithms (including Levenshtein edit distance and its siblings, Jaro-Winkler, Longest Common Subsequence, cosine similarity, etc.) are currently implemented.
+It is used in the NLP and classifier components of the Cognitive Service Platform cmd.csp.
+
+
+### Prerequisites
+
+There are no prerequisites.
+
+Included dependencies:
+```xml
+<dependency>
+  <groupId>net.jcip</groupId>
+  <artifactId>jcip-annotations</artifactId>
+  <version>1.0</version>
+</dependency>
+```
+### Installing/Usage
+
+To use, merge the following into your Maven POM (or the equivalent into your Gradle build script):
+
+```xml
+<repository>
+  <id>github</id>
+  <name>GitHub swelcker Apache Maven Packages</name>
+  <url>https://maven.pkg.github.com/swelcker</url>
+</repository>
+
+<dependency>
+  <groupId>cmd.csp</groupId>
+  <artifactId>cspsimilarity</artifactId>
+  <version>1.0.0</version>
+</dependency>
+```
+
+Then add `import cspsimilarity.*;` to your application:
+
+```java
+// Example (excerpt; "..." marks omitted application code)
+import cspsimilarity.*;
+...
+// One engine instance per measure
+private NormalizedLevenshtein engineNL = new NormalizedLevenshtein();
+private JaroWinkler engineJW = new JaroWinkler();
+private MetricLCS engineMLCS = new MetricLCS();
+private NGram engineNGRAM = new NGram(3);
+private Cosine engineCOSINE = new Cosine(9);
+private Jaccard engineJACARD = new Jaccard(9);
+private SorensenDice engineSOREDICE = new SorensenDice(9);
+...
+String source = sourceText;
+String search = toSearch;
+
+double sS = 0d;
+
+// MetricLCS and NGram are used through distance(); 1 - distance turns it into a similarity
+sS = engineNL.similarity(source, search);
+sS = engineJW.similarity(source, search);
+sS = 1d - engineMLCS.distance(source, search);
+sS = 1d - engineNGRAM.distance(source, search);
+sS = engineCOSINE.similarity(source, search);
+sS = engineJACARD.similarity(source, search);
+sS = engineSOREDICE.similarity(source, search);
+```
+
+## Built With
+
+* [Maven](https://maven.apache.org/) - Dependency Management
+
+
+## Contributing
+
+Please read [CONTRIBUTING.md](https://gist.github.com/PurpleBooth/b24679402957c63ec426) for details on our code of conduct, and the process for submitting pull requests to us.
+
+## Versioning
+
+We use [SemVer](http://semver.org/) for versioning. For the versions available, see the [tags on this repository](https://github.com/swelcker/cmd.csp.similarity/tags).
+
+## Authors
+
+* **Stefan Welcker** - *Modifications based on tdebatty/java-string-similarity*
+
+See also the list of [contributors](https://github.com/swelcker/cmd.csp.similarity/contributors) who participated in this project.
+
+## License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
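
Note on the README example: it calls `similarity(...)` on some engines and `1d - distance(...)` on others, but does not show what a normalized measure computes. The following is a minimal, self-contained sketch of a normalized Levenshtein similarity, assuming the edit distance is divided by the length of the longer string. The class name `NormalizedLevenshteinSketch` and its methods are hypothetical and written for illustration only; this is not the code of `cspsimilarity.NormalizedLevenshtein`, whose implementation is not shown in this excerpt.

```java
// Illustrative sketch only; not the library's implementation.
final class NormalizedLevenshteinSketch {

    /** Levenshtein edit distance using two rolling rows of the DP table. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a yields the empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalization: 1 - distance / length of the longer string. */
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) distance(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(similarity("kitten", "sitting"));
    }
}
```

Under this assumption the result always lies in [0, 1], so it can be compared with the other scores that the README assigns to `sS`.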

cmd.csp.similarity.iml

+2
@@ -0,0 +1,2 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<module type="JAVA_MODULE" version="4" />

dependency-reduced-pom.xml

+95
@@ -0,0 +1,95 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <groupId>cmd.csp</groupId>
+  <artifactId>cspsimilarity</artifactId>
+  <version>1.0.0</version>
+  <developers>
+    <developer>
+      <name>Stefan Welcker</name>
+      <email>[email protected]</email>
+    </developer>
+  </developers>
+  <licenses>
+    <license>
+      <name>MIT License</name>
+      <url>http://www.opensource.org/licenses/mit-license.php</url>
+    </license>
+  </licenses>
+  <build>
+    <plugins>
+      <plugin>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <version>3.8.1</version>
+        <configuration>
+          <source>1.8</source>
+          <target>1.8</target>
+        </configuration>
+      </plugin>
+      <plugin>
+        <artifactId>maven-source-plugin</artifactId>
+        <version>3.1.0</version>
+        <executions>
+          <execution>
+            <id>attach-sources</id>
+            <goals>
+              <goal>jar-no-fork</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <artifactId>maven-release-plugin</artifactId>
+        <version>2.5.3</version>
+        <configuration>
+          <tagNameFormat>v@{project.version}</tagNameFormat>
+        </configuration>
+      </plugin>
+      <plugin>
+        <artifactId>maven-shade-plugin</artifactId>
+        <version>3.2.1</version>
+        <executions>
+          <execution>
+            <phase>package</phase>
+            <goals>
+              <goal>shade</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+    </plugins>
+  </build>
+  <repositories>
+    <repository>
+      <snapshots>
+        <enabled>false</enabled>
+      </snapshots>
+      <id>central</id>
+      <name>Maven Repository Switchboard</name>
+      <url>http://repo1.maven.org/maven2</url>
+    </repository>
+  </repositories>
+  <pluginRepositories>
+    <pluginRepository>
+      <releases>
+        <updatePolicy>never</updatePolicy>
+      </releases>
+      <snapshots>
+        <enabled>false</enabled>
+      </snapshots>
+      <id>central</id>
+      <name>Maven Plugin Repository</name>
+      <url>http://repo1.maven.org/maven2</url>
+    </pluginRepository>
+  </pluginRepositories>
+  <distributionManagement>
+    <repository>
+      <id>github</id>
+      <name>GitHub swelcker Apache Maven Packages</name>
+      <url>https://maven.pkg.github.com/swelcker/cmd.csp.similarity</url>
+    </repository>
+  </distributionManagement>
+  <properties>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+  </properties>
+</project>
