Vol. 26 No. 1 (June, 2007) |
Vol. 26 No. 2 (Sep., 2007) |
Vol. 26 No. 3 (Dec., 2007) |
Vol. 26 No. 4 (Mar., 2008) |
Vol. 26 No. 5 (June, 2008) |
Vol. 26 No. 6 (Sep., 2008) |
Vol. 26 No. 7 (Dec., 2008) |
Vol. 26 No. 8 (Mar., 2009) |
The GAJ database is publicly available on the Internet. Using it, we created a matrix recording whether the standard form occurs for each question at each research point, and applied Hayashi's Quantification Method Type Three, a multivariate analysis technique, to that matrix.
The results showed the following about the standard language forms of Japanese conjugations.
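As an illustration of the analysis step described above, the following is a minimal sketch of Quantification Method Type Three computed by reciprocal averaging, one common way to obtain the same first-axis scores as correspondence analysis of a 0/1 matrix. The function name `reciprocal_averaging` and the small `presence` matrix are hypothetical and are not taken from the GAJ data.

```python
def reciprocal_averaging(matrix, iterations=200):
    """First-axis scores for a 0/1 matrix.

    Assumes every row and every column contains at least one 1.
    Returns (row_scores, col_scores, eigenvalue); the eigenvalue is the
    squared correlation between row and column scores.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_tot)

    def standardize(scores):
        # Weighted centering removes the trivial constant solution;
        # weighted scaling fixes the unit of the scores.
        mean = sum(c * s for c, s in zip(col_tot, scores)) / total
        centered = [s - mean for s in scores]
        norm = (sum(c * s * s for c, s in zip(col_tot, centered)) / total) ** 0.5
        return [s / norm for s in centered], norm

    col_scores, _ = standardize([float(j) for j in range(n_cols)])
    eigenvalue = 0.0
    for _ in range(iterations):
        # Each row score is the average score of the columns it contains.
        row_scores = [
            sum(matrix[i][j] * col_scores[j] for j in range(n_cols)) / row_tot[i]
            for i in range(n_rows)
        ]
        # Each column score is the average score of the rows containing it.
        raw = [
            sum(matrix[i][j] * row_scores[i] for i in range(n_rows)) / col_tot[j]
            for j in range(n_cols)
        ]
        col_scores, eigenvalue = standardize(raw)
    return row_scores, col_scores, eigenvalue


# Hypothetical matrix: rows = research points, columns = questions,
# 1 = the standard form was recorded at that point for that question.
presence = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
point_scores, question_scores, eigenvalue = reciprocal_averaging(presence)
print([round(s, 3) for s in point_scores])
print(round(eigenvalue ** 0.5, 3))  # correlation coefficient of the first axis
```

Research points with similar patterns of standard forms receive similar scores, so sorting rows and columns by their scores brings related points and questions next to each other. Later axes would require removing the first solution before iterating again, which this sketch does not attempt.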
In our analysis, the properties of the ga-marked noun (e.g., kawa) in the target sentence (e.g., Kawa-ga (minami-ni) nagareteita) were encoded by 18 variables in total, and the similarity of the ga-marked nouns in the data was measured using hierarchical cluster analysis. Cluster-wise and form-wise cross tabulations were then conducted to evaluate the strength of the correlation between the four inflectional forms and the semantic classes of the ga-marked nouns.
We obtained a good correlation between the semantic classes of the ga-marked nouns and the four inflectional forms of nagareru. Further analysis indicated a complex interaction between the inflectional forms of the verb and the semantic properties of its argument(s). These findings suggest that current theories and models of tense/aspect may suffer from oversimplification because they ignore the co-variation between argument selection and tense/aspect selection.
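The two-step procedure described above (cluster the nouns, then cross-tabulate clusters against inflectional forms) can be sketched as follows. This is only an illustration under stated assumptions: the four binary features, the form labels `form_A` and `form_B`, average linkage, Hamming distance, and Cramér's V as the strength measure are all hypothetical choices, not the 18 variables or the statistics used in the study.

```python
from collections import Counter
from itertools import combinations
import math


def hamming(a, b):
    """Number of variables on which two encoded nouns differ."""
    return sum(x != y for x, y in zip(a, b))


def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per vector."""
    clusters = [[i] for i in range(len(vectors))]

    def dist(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(hamming(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters (a < b, so deleting b is safe).
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters[b]
        del clusters[b]

    labels = [0] * len(vectors)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def cramers_v(labels, forms):
    """Association strength (0 to 1) in the cluster-by-form cross tabulation."""
    table = Counter(zip(labels, forms))
    rows, cols = Counter(labels), Counter(forms)
    n = len(labels)
    chi2 = sum(
        (table[(r, c)] - rows[r] * cols[c] / n) ** 2 / (rows[r] * cols[c] / n)
        for r in rows for c in cols
    )
    return math.sqrt(chi2 / (n * (min(len(rows), len(cols)) - 1)))


# Hypothetical data: each noun token encoded by four binary variables,
# paired with the inflectional form of the verb it occurred with.
nouns = [(1, 1, 0, 0), (1, 1, 0, 1), (1, 0, 0, 0),
         (0, 0, 1, 1), (0, 1, 1, 1), (0, 0, 1, 0)]
forms = ["form_A", "form_A", "form_B", "form_B", "form_B", "form_B"]

labels = agglomerate(nouns, n_clusters=2)
print(Counter(zip(labels, forms)))   # the cross tabulation
print(round(cramers_v(labels, forms), 3))
```

A value near 1 would indicate that the semantic clusters almost determine the inflectional form, and a value near 0 that the two are unrelated.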
This study proposes a new method of applying a multiple logistic regression model to dialectological data, in particular the so-called glottogram, and illustrates the process of analysis with hypothetical cases.
A multiple logistic regression model is given as
(1) log{p/(1-p)} = Z
where p is a probability, Z is a linear combination of the form Z = a1*X1 + a2*X2 + a3*X3 + ... + b, and log is the natural logarithm (base e). Equation (1) can be transformed into equation (2) as follows:
(2) p = 1 / {1+exp(-Z)}
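For readers who want the intermediate steps, the transformation from (1) to (2) amounts to exponentiating both sides and solving for p. Written out in LaTeX notation, with the same symbols p and Z:

```latex
\begin{align}
\log\frac{p}{1-p} &= Z \\
\frac{p}{1-p} &= e^{Z} \\
p &= (1-p)\,e^{Z} \\
p\,\bigl(1+e^{Z}\bigr) &= e^{Z} \\
p &= \frac{e^{Z}}{1+e^{Z}} = \frac{1}{1+e^{-Z}}
\end{align}
```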
The multiple logistic regression model can be applied to the analysis of glottograms and other dialectological data by treating factors such as age, observation point, and situation as the variables X1, X2, X3, etc.
With this model it is possible to analyze two or more factors of dialectological data, including factors on a nominal scale, within a single equation; to analyze the data while excluding confounding factors; and to estimate future trends with high precision even when the observed data are incomplete, a situation researchers often face. For investigations of written materials, the model can likewise be used to analyze changes in vocabulary size, changes in the ratios of word types, or other kinds of language change, with factors such as the genre of the text or the gender of its author.
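To make the model concrete, the sketch below fits equation (2) to a small hypothetical glottogram-style data set by plain gradient ascent on the log-likelihood. The data, the variable coding (X1 = age rescaled around 40, X2 = a 0/1 dummy for a nominal "urban" factor), and the names `fit_logistic` and `predict` are assumptions made for illustration; a real analysis would normally rely on an established statistics package.

```python
import math


def sigmoid(z):
    """Equation (2): p = 1 / {1 + exp(-Z)}."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Estimate a1, a2, ... and b by gradient ascent on the log-likelihood."""
    n_features = len(X[0])
    a = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_a = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = yi - sigmoid(sum(w * x for w, x in zip(a, xi)) + b)
            for k in range(n_features):
                grad_a[k] += error * xi[k]
            grad_b += error
        a = [w + learning_rate * g / len(X) for w, g in zip(a, grad_a)]
        b += learning_rate * grad_b / len(X)
    return a, b


def encode(age, urban):
    # X1: age in decades relative to 40; X2: nominal factor as a 0/1 dummy.
    return [(age - 40) / 10.0, float(urban)]


# Hypothetical informants: (age, urban?, uses the new form?)
informants = [
    (20, 1, 1), (20, 0, 1), (30, 1, 1), (30, 0, 1), (30, 0, 0), (40, 1, 1),
    (40, 1, 0), (40, 0, 0), (50, 1, 1), (50, 0, 0), (60, 1, 0), (60, 0, 0),
]
X = [encode(age, urban) for age, urban, _ in informants]
y = [used for _, _, used in informants]
a, b = fit_logistic(X, y)


def predict(age, urban):
    """Estimated probability that a speaker uses the new form."""
    return sigmoid(sum(w * x for w, x in zip(a, encode(age, urban))) + b)


print(round(predict(25, 1), 3), round(predict(55, 0), 3))
```

Because both factors sit in one equation, the coefficient for age is estimated while holding the location factor fixed, and `predict` can be evaluated for ages or combinations that were not observed.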
As to the quantity of Japanese data that search engines store and use, Yahoo! was found to surpass Google and to rank first among the major search engines that can handle Japanese. The data size of Yahoo! was estimated to be about thirty terabytes (3 x 10^13 bytes), which is comparable to thirty thousand years' worth of articles from a major Japanese newspaper.
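The comparison above implies a per-year figure for newspaper text. As a quick check of the arithmetic, using only the two numbers given in the text (the variable names are illustrative):

```python
TOTAL_BYTES = 3 * 10**13     # about thirty terabytes, as estimated above
NEWSPAPER_YEARS = 30_000     # thirty thousand years of newspaper articles

# Implied size of one year of articles from a major newspaper.
bytes_per_newspaper_year = TOTAL_BYTES // NEWSPAPER_YEARS
print(bytes_per_newspaper_year)
```

In other words, the comparison assumes roughly one gigabyte (10^9 bytes) of text per newspaper year.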
With regard to the quality, or reliability, of search results, Yahoo! and Google were compared in terms of the logical consistency and the stability of their search results. Google was found to show a considerably high degree of both logical inconsistency and instability, neither of which was observed for Yahoo!.
From these observations, it may safely be concluded that Yahoo! is the best search engine available at present if judged from the viewpoint of those who want to use search engines to search Japanese expressions and view the number of hits.
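The text does not spell out how logical consistency and stability were measured. One plausible reading, given here purely as an assumption, is that hit counts should respect basic set logic (an AND query cannot return more hits than either term alone, and an OR query cannot return fewer) and that repeating the same query should return nearly the same count. The hit counts below are invented for illustration and no real search API is called.

```python
def consistency_violations(hits):
    """Check hit counts for queries 'A', 'B', 'A AND B', 'A OR B' against set logic."""
    problems = []
    if hits["A AND B"] > min(hits["A"], hits["B"]):
        problems.append("AND query returns more hits than a single term")
    if hits["A OR B"] < max(hits["A"], hits["B"]):
        problems.append("OR query returns fewer hits than a single term")
    return problems


def relative_spread(counts):
    """Instability of repeated identical queries: (max - min) / max."""
    return (max(counts) - min(counts)) / max(counts)


# Hypothetical hit counts for two Japanese expressions A and B.
engine_x = {"A": 120_000, "B": 45_000, "A AND B": 30_000, "A OR B": 135_000}
engine_y = {"A": 120_000, "B": 45_000, "A AND B": 52_000, "A OR B": 98_000}

print(consistency_violations(engine_x))
print(consistency_violations(engine_y))
print(relative_spread([120_000, 120_000, 119_500]))
print(relative_spread([120_000, 64_000, 210_000]))
```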
The overall tendencies of the surveys showed that linguistic change follows an S-shaped curve. By fitting a theoretical curve based on this idea, the number of years necessary for a linguistic change can be calculated. According to a previous study, more than one hundred years were thought to be necessary from the beginning to the end of a linguistic change. This time, more data were acquired and a newer calculation technique was applied to re-examine the result of the former study. The result showed that nearly two hundred years are necessary for aggregate changes to be completed.
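The calculation described above can be illustrated with a logistic curve, one standard S-shaped curve; whether the study used exactly this curve is not stated, so this is an assumption. If the proportion of speakers using the new form is p(t) = 1 / {1 + exp(-k(t - t0))}, the time needed to rise from one proportion to another depends only on the slope k. The 1% and 99% thresholds for "beginning" and "end" are also assumed values.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def years_for_change(slope_per_year, start=0.01, end=0.99):
    """Years for a logistic S-curve to rise from `start` to `end` usage."""
    return (logit(end) - logit(start)) / slope_per_year


def slope_for_duration(years, start=0.01, end=0.99):
    """Slope k implied by a change that takes `years` to complete."""
    return (logit(end) - logit(start)) / years


k_100 = slope_for_duration(100)   # change completed in about a century
k_200 = slope_for_duration(200)   # change completed in about two centuries
print(round(k_100, 4), round(k_200, 4))
print(round(years_for_change(k_200), 1))
```

Under these assumptions, doubling the estimated duration from one hundred to two hundred years corresponds to halving the fitted slope of the curve.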
Here two findings obtained since that time are reported. The first is that the instability of the Google search is even more serious than was found earlier. The second finding is that the total quantity of Japanese Web documents was underestimated slightly in my previous paper and requires re-estimation.