[collect] Collecting Google result URLs with JavaScript

Google hack: simple page URL collection

There are many ways to collect URLs from Google search results. The simplest is to read the DOM nodes directly with JavaScript. For example:

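// Each result title is assumed to be an <h3> that wraps the result link.
var h3 = document.getElementsByTagName('h3');
for (var i = 0; i < h3.length; i++) {
    var a = h3[i].getElementsByTagName('a');
    // Skip headings that contain no link.
    if (a.length > 0) {
        console.log(a[0].href);
    }
}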

In Chrome, press F12 to open the developer tools, switch to the "Console" tab, paste in the code above, and press Enter to see the result.
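When the list is meant to be copied out of the console, it can help to collect the links into an array and drop duplicates instead of logging them one by one. The following is only a sketch: the function name `collectResultUrls` and the `h3 a` selector are assumptions that mirror the markup used in the snippet above, and current Google pages may be structured differently.

```javascript
// Collect the unique link targets found inside result headings.
// `root` is any object that offers querySelectorAll, e.g. `document` in the browser.
// The 'h3 a' selector is an assumption taken from the markup this post relies on.
function collectResultUrls(root) {
    const seen = new Set();
    for (const a of root.querySelectorAll('h3 a')) {
        // Ignore anchors without a usable href.
        if (a.href) {
            seen.add(a.href);
        }
    }
    return Array.from(seen);
}

// In the browser console:
// console.log(collectResultUrls(document).join('\n'));
```

Joining the array with newlines prints one URL per line, which is easy to paste into a text file.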

In Java, jsoup makes it just as easy to get the URLs of the search results:

import java.io.IOException;
import java.net.URLDecoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class GoogleUrlCollector { // class name is illustrative
	public static void main(String[] args) throws IOException {
		Document doc = Jsoup.connect("https://www.google.ws/search?num=100&site=&source=hp&q=filetype%3Ajsp&oq=filetype%3Ajsp&gs_l=hp.3...8115.14780.0.15194.22.21.1.0.0.0.523.5187.3j3j3j5j4j1.19.0....0...1c.1.36.hp..14.8.1440.P_2EQhc7Pz0").userAgent("Googlebot/2.1 (+http://www.googlebot.com/bot.html)").timeout(5000).get();
		for (Element a : doc.select("h3 a")) {
			// Result links look like /url?q=<encoded target>&sa=...; capture the target.
			Matcher m = Pattern.compile("/url\\?q=(.*?)&sa").matcher(a.attr("href"));
			if (m.find()) {
				System.out.println(URLDecoder.decode(m.group(1), "UTF-8"));
			}
		}
	}
}
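The Java code above pulls the real target out of a redirect link of the form `/url?q=...&sa=...` and then URL-decodes it. The same step can be sketched in JavaScript with the standard `URL` API instead of a regular expression. The function name `extractTarget` and the sample links are assumptions made for illustration.

```javascript
// Extract the real target from a redirect link such as
// "/url?q=http%3A%2F%2Fexample.com%2Fa.jsp&sa=U".
// Returns null when the link is not a /url redirect or has no q parameter.
function extractTarget(href) {
    // The base is only needed so that relative hrefs can be parsed.
    const parsed = new URL(href, 'https://www.google.com');
    if (parsed.pathname !== '/url') {
        return null;
    }
    // searchParams.get() already percent-decodes the value.
    return parsed.searchParams.get('q');
}

console.log(extractTarget('/url?q=http%3A%2F%2Fexample.com%2Fa.jsp&sa=U'));
```

Parsing the query string avoids depending on `&sa` directly following the `q` parameter, which the regular expression requires.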

Using a regular expression:

package org.javaweb.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestReg {

    public static void main(String[] args) {
        String source = "<h3 class=\"r\"><a href=\"http://baidu.com\">百度</a></h3><h3 class=\"r\"><a href=\"http://google.com\">谷歌</a></h3> ";
        StringBuilder resultText = new StringBuilder();
        StringBuilder resultHref = new StringBuilder();
        System.out.println("======= start matching ========");
        // Group 2 captures the href value, group 3 the link text.
        String patternStr = "(<h3 class=\"r\"><a.+?)href=\"(.+?)\">(.+?)(</a></h3>)";
        Pattern pattern = Pattern.compile(patternStr);
        Matcher matcher = pattern.matcher(source);
        while (matcher.find()) {
            resultHref.append(matcher.group(2)).append("\n");
            resultText.append(matcher.group(3)).append("\n");
        }
        System.out.println("======= link text =======");
        System.out.println(resultText.toString());
        System.out.println("======= href attribute values =======");
        System.out.println(resultHref.toString());
    }
}
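For comparison, the same regular-expression extraction can be sketched in JavaScript with `String.prototype.matchAll`. The function name `extractLinks` is an assumption, and the pattern uses character classes such as `[^"]+` in place of the `.+?` groups in the Java version, so it is a variation on that pattern and not a literal port. It expects the same `<h3 class="r"><a href="...">` markup as the sample string.

```javascript
// Pull href and link text out of markup shaped like
// <h3 class="r"><a href="...">text</a></h3>.
function extractLinks(html) {
    // Group 1 is the href value, group 2 is the link text.
    const pattern = /<h3 class="r"><a[^>]*?href="([^"]+)"[^>]*>(.+?)<\/a><\/h3>/g;
    const links = [];
    for (const m of html.matchAll(pattern)) {
        links.push({ href: m[1], text: m[2] });
    }
    return links;
}

const sample = '<h3 class="r"><a href="http://baidu.com">百度</a></h3>' +
    '<h3 class="r"><a href="http://google.com">谷歌</a></h3>';
for (const link of extractLinks(sample)) {
    console.log(link.href + ' ' + link.text);
}
```

As with any regex over HTML, this breaks as soon as the attribute order or quoting changes, so a DOM-based approach is usually more robust for real pages.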

Original link:

http://p2j.cn/?p=807
